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ABSTRACT 

''Using  the  maximum- 1  ikel ihood  estimation  method  and  minimization 
tecnniques,  quasi -geos  trophic  wave  solutions  were  fitted  to  the 
observations  of  the  1981  Ocean  Acoustic  Tomography  Experiment.  The 
experiment  occupied  a  300  km  square  area  centered  at  26°N,  70°W,  and 
had  a  duration  of  ~80  days.  The  data  set  consisted  of  acoustic 
travel-time  records,  temperature  records  and  CTD  profiles,  obtained 
from  the  acoustic  tomographic  array,  moored  temperature  sensors  and 
recorders,  and  ship  surveys,  respectively.  While  the  latter  two 
were  conventional  spot  measurements,  the  former  corresponds  to 
integral  measurements  of  the  temperature  (or  sound-speed)  field. 


The  optimal  fit  to  the  data  corresponded  to  3  waves  in  the  first 
baroclinic  mode,  evolving  under  the  presence  of  a  westward  mean  flow 


with  vertical  shear.  The  flow  was  estimated  to  be  weak  (~2  cnv/s), 
but  it  changed  the  wave  periods  significantly  by  producing  large 
Doppler  shifts.  The  waves  were  dynamically  stable  to  the  mean  flow, 
had  weak  nonlinear  interactions  with  each  other  and  did  not  form  a 
resonant  traid;  thus  they  constituted  a  fully  linear  solution. 

Evidence  for  the  existence  of  the  waves  was  strongly  supported 
by  the  high  correlation  ( ~0 . 9 )  between  the  data  and  the  fit,  the 
large  amount  of  signal  energy  resolved  ( ~80  percent),  the  excellent 
quality  of  the  wave -parameter  estimate  (only  about  10  percent  in 
error),  and  the  general  agreement  between  the  observations  and 
quasi -geostrophic  linear  dynamics. 


Thesis  Supervisor:  Dr.  Yves  J.F.  Desaubies 

Associate  Scientist,  Woods  Hole  Oceanographic 
Institution,  Woods  Hole,  MA. 
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CHAPTER  1 


INTRODUCTION 

Over  the  last  two  decades,  several  vigorous  research  programs 
have  been  conducted  by  scientists  to  study  oceanic  mesoscale 
variability.  As  a  consequence,  a  more  detailed  and  realistic 
description  of  the  ocean  circulation  has  been  obtained.  Much  of  the 
knowledge  of  the  variability  has  been  obtained  from  extensive 
experiments  such  as  POLYGON,  1970  (Brekhovskikh  et  al . ,  1971), 
MOUE-O,  1971-1972,  MODE-1,  1973  {MODE  Group,  1978)  and  the  recent 
POLYMODE,  1974-1978  (U.S.  POLYMODE  Organizing  Committee,  1976)  in 
whicn  mul ti -moorings  and  a  variety  of  instruments  were  deployed  to 
observe  the  four-dimensional  fields  of  current  and  density  at 
mid-latitudes  in  the  North  Atlantic.  Today,  it  is  well-known  that 
mesocale  fluctuations  that  are  often  called  'eddies'  are 
energetically  dominant  and  exist  everywhere  in  open  oceans.  Even 
close  to  land,  numerous  observations  of  trapped  mesoscale  motions 
have  also  been  reported  (Longuet-Hi ggins,  1968,  Wunsch,  1972,  and 
Hogg,  1980). 

Besides  being  the  most  dominant  feature  in  the  ocean,  eddies 
interact  with  the  mean  circulation  through  the  processes  of  energy 
cascades  to  larger-scale  flows  (Rhines,  1975)  and  barotropic  and 
baroclinic  instabilities  (Pedlosky,  1979),  and  they  transport  heat 
an  a  salt  effectively  by  their  intense  flow  field.  Therefore,  the 
knowledge  of  eddy  dynamics  is  of  fundamental  interest  to  physical 


oceanographers  in  understanding  the  general  circulation. 

Furthermore,  the  research  is  also  of  great  significance  to 
meteorologists  and  marine  scientists  in  other  disciplines,  since 
ocean  eddies  can  influence  the  long-term  climate  on  earth  through 
air-sea  interaction,  transport  chemicals,  and  relocate  biological 
matter. 

Mesoscale  eddies  are  characterized  by  periods  of  50  to  100  days, 
horizontal  scales  of  order  100  km  and  vertical  scales  comparable  to 
the  depth  of  the  ocean  (Richman  et  al  .,  1977,  and  McWilliams, 

1979).  In  places  where  the  flow  field  is  strong,  for  example  in 
regions  close  to  the  major  currents,  the  fluctuations  are  nonlinear 
turbulent  motions.  However,  it  is  conceivable  that  the  fluctuations 
can  be  wave-like  and  dispersive  in  places  that  are  relative  calm, 
because  the  linearized  equation  of  mesoscale  motion,  that  is  the 
linearized  quasi  -geos troph ic  potential  vorticity  equation,  does 
admit  planetary  wave  solutions  (LeBlond  and  Mysak,  1978,  and 
Pedlosky,  1979).  Furthermore,  the  wave  solution  does  exhibit 
behavior  that  is  consistent  with  some  observations,  for  example, 
westward  phase  propagation. 

Li  terature  on  the  theory  of  planetary  waves  is  abundant,  but 
only  slight  observational  evidence  for  their  existence  in  open 
oceans  exists.  Perhaps,  the  most  striking  evidence  to  date  was 
found  by  McWilliams  and  Robinson  (1974),  and  McWilliams  and  Fiierl 
(1976),  by  fitting  waves  to  the  POLYGON  observations  and  tr* 

MODE -array  data,  respectively .  POLYGON  was  conducted  by  the  USSR  in 
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the  tropical  North  Atlantic  during  the  spring  and  summer  of  1970. 

The  array,  which  centered  at  16*30' N,  33°30'W,  measured  the  eddy 
currents  for  several  months  from  moored  current  meters  and 
hydrography.  The  data  was  analyzed  and  presented  by  Koshlyakov  and 
Grachev  (1973).  They  inferred  that  a  single,  anti -cycl onic  eddy,  a 
few  hundred  kilometers  in  diameter,  traversed  the  array  during  the 
experiment,  and  synthesized  their  observations  in  terms  of  a  moving 
elliptical  cylinder  representing  the  locus  of  maximum  horizontal 
current  at  each  depth.  McWilliams  and  Robinson  (1974)  fitted 
planetary  waves  in  a  two-layer  model  to  the  descriptive  synthesis, 
in  which  the  free  parameters,  that  Is  the  wave  amplitudes  and 
wavenumbers  were  determined  from  the  major  and  minor  axes,  the 
orientation  angle  and  the  maximum  orbital  speed  of  the  ellipse.  It 
was  found  that  the  synoptic  structure  and  propagation  of  the  ellipse 
were  well  matched  by  a  pair  of  baroclinic  waves  with  equal  pressure 
amplitudes.  However,  the  POLYGON  wave  fit  was  highly  subjective  and 
might  not  be  optimal  due  to  the  fact  that  the  number  of  waves  was 
arbitrarily  chosen  and  the  observations  used  were  not  the  actual 
data  themselves.  The  lack  of  actual  data  has  prevented  McWilliams 
and  Robinson  from  making  a  quantitative  assessment  of  the  wave  model  . 

The  Mid -Ocean  Dynamics  Experiments  MODE-O  and  MODE-1  were 
conducted  jointly  by  the  USA  and  UK  in  an  approximately  400  km 
square  region  centered  at  28*N,  69*40'W,  again  in  the  tropical  North 
Atlantic.  MOUE-O  was  a  collection  of  several  pilot  studies  that 
were  carried  out  between  1971  and  1972  to  identify  the  energy  level. 
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and  space  and  time  scales  of  the  mesoscale  motion.  It  was  then 
followed  by  MODE-1,  which  was  a  more  comprehensive  experiment 
designed  to  provide  an  accurate  four-dimensional  mapping  of  a 
mid-ocean  eddy  during  the  spring  of  1973.  Several  combinations  of 
barotropic  and  baroclinic  waves  in  a  continuous  ocean  model  were 
fitted  to  the  MODE-O  and  MODE-1  data  sets  by  McWilliams  and  FI  ierl 
(1976).  Wiile  the  MODE-O  data  set  contained  only  current -meter 
records  having  durations  of  from  1  to  3  months,  the  MODE-1  data  set 
was  much  larger  and  more  uniform  in  space  and  time,  having  a 
duration  of  4  months.  It  also  contained  different  types  of 
observations ,  i.e.  from  current  meters,  moored  temperature  sensors, 
hydrographic  stations  and  float  tracks.  In  the  fitting  process,  the 
free  wave  parameters  were  chosen  optimally  to  minimize  a  quadratic 
error  norm  for  the  differences  between  the  data  and  fit.  While  the 
best  MODE-1  fit  consisted  of  a  pair  of  waves  in  the  barotropic  mode 
and  a  pair  of  waves  in  the  first  baroclinic  mode,  the  best  MODE-O 
fit  consisted  of  a  pair  of  barotropic  waves  only.  Both  MODE  wave 
fits  were  fairly  successful,  having  correlations  of  "0.7  with  the 
data  and  accounting  for  ~l/2  of  the  observed  signal  energies,  I.e. 
~70  percent  of  the  signals  (rms ).  However,  the  MODE-1  fit 
corresponded  to  an  inconsistent  1  Inear  solution:  nonlinear  wave-wave 
interactions  within  the  fit  were  predicted  by  the  weakly  nonlinear 
theory  to  be  strong  but  were  not  found  In  the  data.  Thus,  there 
remains  some  doubt  as  to  whether  planetary  waves  truely  existed 
during  MODE-1,  and  more  fundamentally  perhaps,  whether  planetary 
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wave  propagation  is  a  typical  dynamical  phenomenon  in  that  part  of 
the  ocean. 

The  purpose  of  this  dissertation  is  threefold.  First,  it 
reinvestigates  the  existence  of  planetary  waves  in  the  tropical 
North  Atlantic.  This  time,  the  investigation  is  done  by  trying  to 
detect  the  wave  signals  from  the  acoustic  and  spot  observations  made 
in  the  1981  Ocean  Acoustic  Tomography  Experiment  (The  Ocean 
Tomography  Group,  1982),  and  in  doing  so,  the  wave  dynamics  in  the 
region  which  is  centered  at  26°N,  70 °W  (which  will  be  referred  to  as 
the  tomographic  region)  is  also  investigated.  Second,  it  examines 
the  performance  of  the  acousti c -tomographic  observational  system, 
the  spot -observational  system  and  the  combination  of  the  two 
systems,  as  deployed  in  the  experiment,  in  observing  the  waves  and 
also  in  mapping  the  ocean.  Third,  it  explores  the  possibility  of 
using  acoustic  tomography  to  provide  adequate  large-scale  monitoring 
in  the  absence  of  the  tracking  of  the  motion  of  the  acoustic 
moorings . 

The  investigation  of  the  existence  and  dynamics  of  planetary 
waves  involves  analyzing  the  fits  of  different  but  plausible 
wave-propagation  models  to  different  types  of  observations  of 
sound-speed  or  temperature  perturbations ,  made  by  the  CTD  casts, 
temperature  sensors,  temperature -pressure  recorders  and  the  acoustic 
tomographic  array  deployed  in  the  experiment.  The  hope  is  to  be 
able  to  detect  the  waves  and,  at  the  same  time,  determine  the 
correct  wave  dynamics  in  the  fitting  process  by  comparing  the 
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quality  of  the  di  fferent  wave -model  fits.  Due  to  the  insufficiency 
of  explicit  current  measurements  wh Ich  came  from  only  two  horizontal 
locations,  some  deficiences  will  persist  in  our  investigation.  For 
example,  we  cannot  observe  the  barotropic  waves  and  explore  the 
thermal  wind  relation  between  the  wave -induced  current  and  density 
perturbati  ons. 

The  technique  of  fitting  used  here  is  procedurally  similar  to 
that  used  by  McWilliams  and  Flier!  (1976),  corresponding  to  the 
minimization  of  a  quadratic  error  norm  between  the  data  and  the  wave 
fit,  that  is  a  weighted  sum  of  products  of  residuals.  However,  a 
fundamental  difference  is  that,  while  they  have  defined  their  error 
norm  by  choosing  the  weighting  factors  in  a  subjective  manner  as  to 
give  equal  weighting  to  each  subset  of  data  of  the  same  type,  we 
have  constructed  our  norm  by  adopting  the  idea  of  maximum  likelihood 
from  the  stochastic  framework,  i.e.  the  weighting  factors  are  the 
reciprocals  of  the  noise  variances.  The  appeal  of  using  statistical 
approaches  is  that  the  meaning  of  a  wave  fit  being  the  'optimal  1  or 
'best'  can  be  explicitly  defined  in  terms  of  statistical 
conditions.  Another  difference  is  that  our  fi tting  involves 
acoustic  observations  that  correspond  to  integral  measurements  of 
the  field  in  addition  to  spot  observations. 

We  must  give  credit  to  The  Ocean  Tomography  Group  who  provided 
tne  data.  The  experiment  was  conducted  by  them  primarily  for  the 
testing  of  'Ocean  Acoustic  Tomography'  which  is  a  pure  acoustic 
inverse  scheme  for  monitoring  large-scale  fluctuations  in  ocean 


basins.  The  innovative  idea  of  ocean  tomography  was  first 
introduced  by  Munk  and  Wunsch  (  1979)  and  the  scheme  is  analogous  to 
the  medical  tomographic  procedure  CAT  scan.  A  typical  mid -ocean 
tomographic  system,  as  described  by  Munk  and  Wunsch  and  deployed  in 
the  experiment,  consists  of  a  sparse  horizontal  array  of  moored 
mid-water  acoustic  sources  and  receivers  that  surrounds  a  large  area 
of  the  ocean  under  stuciy,  so  that  by  exploiting  the  properties  of 
sound  propagation  in  the  SOFAR  channel,  such  as  low  attenuation  and 
multipath  arrivals,  the  entire  volume  can  be  remotely  sensed, 
horizontally,  vertically,  and  temporally  with  large-scale  resolution 
by  using  repeated  acoustic  transmissions.  Thus,  through 
mathematical  modeling  of  the  relation  between  oceanic  and  acoustical 
fluctuations,  the  four-dimensional  sound-speed  perturbation  field 
should  be  reconstructable  based  on  the  observed  perturbations  of  the 
multipath  arrival  times  using  inverse  techniques.  Superior  to 
traditional  spot-measurement  techniques,  acoustic  tomography  can 
monitor  a  larger  region  and  provide  a  larger  database  with  fewer 
moorings,  and  its  averaging  (integrating)  process  can  filter  out 
undesirable  small-scale  oceanic  features  automatically. 

Furthermore,  unlike  shipboard  surveys,  it  can  map  the  ocean 
instantly  and  the  mapping  can  be  done  frequently.  These  advantages 
of  cost  effectiveness  and  high  temporal  resolution  are  some  of  the 
appeal  of  acoustic  tomography.  However,  the  acoustical  scheme 
depends  critically  on  the  stability,  identification  and  resolution 
of  multi  paths  over  long  distances.  These  have  been  verified  hv 


Spiesberger  et  al.  (1980)  and  Spindel  and  Spiesberger  (1981)  in 
preliminary  experiments. 

The  1981  experiment  was  the  first  field  test  of  a  full 
tomographic  system  for  mapping  the  ocean  at  mesoscale  resolution. 

In  order  to  evaluate  the  performance  of  the  system,  the  experimental 
region  was  also  measured  with  traditional  techniques  by  The  Ocean 
Tomography  Group  during  the  same  time.  The  idea  was  to  provide  a 
basis  for  comparison.  The  tomographic  system  in  a  linear  form  was 
later  'inverted'  for  the  three-dimensional  sound-speed  perturbation 
fields,  independently  of  time  and  only  with  acoustic  data,  by 
Cornuelle  (1983)  and  Cornuelle  et  al .  (1985).  Because  the  dally 
tomographic  maps  do  compare  favorably  with  the  ship-based  objective 
maps,  they  have  demonstrated  the  practicality  of  acoustic  tomography 
for  mesoscale  monitoring.  Here,  our  principal  objective  Is  to 
investigate  the  existence  and  dynamics  of  pi  anetary  waves; 
therefore,  in  order  to  obtain  the  best  estimate  of  the  wave 
parameters  and  wave  dynamics,  we  have  incorporated  the  spot 
measurements  of  temperature  as  well  as  the  integral  measurements 
(that  is  the  acoustic  travel -time  data)  in  our  estimate. 

The  inversions  of  the  data  performed  in  this  study  are  for  the 
retrieval  of  the  planetary  wave  parameters  and  the  planetary  wave 
field,  and  are  intrinsically  different  from  those  previously  done  by 
Cornuelle  (1983)  and  Cornuelle  et  al  .  (1985).  The  originality  of 
our  Inversions  lies  in  that  they  give  a  ti  me -de  pen  den  t  estimate  of 
the  unknown  field,  the  system  involved  is  nonlinear  with  respect  to 


the  unknown  parameters,  and  contains  both  acoustic  and  traditional 
(spot)  observations.  Speci fi cically,  the  system  is  'inverted'  for 
the  four-dimensional  sound-speed  perturbation  field  subject  to  the 
different  dynamical  constraints  constituted  by  the  plausible  models 
of  wave  propagation.  The  inversions,  therefore,  besides  producing 
maps  of  the  ocean  structure,  also  test  different  wave  dynamics 
against  the  data  for  consistency  and  optimality.  Due  to  the 
nonlinear  nature  of  our  inverse  problem,  standard  linear  techniques 
such  as  Singul  ar  Val ue  Decompositions  and  Gaussian  Eliminations  are 
not  applicable,  so  that  we  use  iterative  descent  minimization 
techniques  to  solve  the  problem. 

In  order  to  observe  the  waves,  the  forward  problem  of  how  the 
observations  of  the  dynamical  field  are  related  to  the  evolution  of 
the  waves  under  different  dynamical  conditions  must  first  be 
resolved.  This  subject  is  pursued  in  Chs.  2  and  3.  In  Ch.  2,  we 
examine  the  theory  of  planetary  waves  by  reviewing  the  literature. 

We  review  the  evolution  of  the  waves  at  mid-latitude,  and  under  the 
possible  effects  of  weak  mean  current,  small  bottom  slope  and  weakly 
nonlinear  wave-wave  interaction.  An  objective  is  to  illustrate  that 
the  space  and  time  behavior  is  constrained  by  the  modal  dispersion 
relationship  and  characterized  by  the  wave  parameters:  wavenumbers, 
wave  amplitudes,  modal  amplitudes  of  the  mean  flow  and  growth 
rates.  In  Ch.  3,  we  develop  the  model  equations  that  relate  the 
data  to  the  wave  parameters  that  characterize  the  wave  and  mean-flow 
induced  sound-speed  perturbation.  We  also  describe  the  filtering 


and  reduction  of  the  data  prior  to  the  inversions.  Furthermore,  we 
present  three  plausible  dynamical  models  of  the  induced  sound-speed 
perturbation ,  which  have  been  fitted  to  the  data  to  estimate  the 
wave  dynamics. 

In  Ch.  4,  we  discuss  the  general  parameter-estimation  or  inverse 
problem.  The  goals  are  to  relate  and  unify  some  commonly  used 
estimation  methods,  deterministic  or  stochastic,  and  to  show  that 
there  is  a  general  estimation  procedure,  common  to  all  the  methods 
considered,  to  obtain  the  optimal  solution.  The  procedure 
corresponds  to  the  minimization  of  an  objective  function  of  a 
weighted  sum  of  products  of  residuals,  that  is  a  quadratic  error 
norm.  We  also  discuss  the  error  variance  of  an  estimate  and  some 
widely  used  numerical  techniques  for  minimization.  We  further 
present  some  simple  measures  of  goodness  of  the  fit  for  appraising 
model  s . 

Using  a  gradient  method  for  minimization  (Fletcher  and  Powell, 
1963),  the  wave  parameters  of  each  of  the  three  plausible  wave 
models  were  estimated.  This  corresponds  to  wave  fitting,  and  in 
order  to  estimate  the  number  of  waves,  a  range  of  one  to  five  waves 
was  assumed  for  each  model  in  the  fitting.  The  results  of  the  wave 
fits  and  the  identification  of  the  optimal  model  and  number  of  waves 
are  described  and  discussed  in  Ch.  5.  Furthermore,  the  error 
variance  of  the  estimated  wave  and  mean-flow  induced  sound-speed 
perturbation,  associated  with  the  error  of  the  optimal  estimate  of 
wave  parameters,  is  analysed. 


In  Ch .  6,  we  first  summarize  the  results  of  the  wave  fits  and 
comment  on  the  dynamics,  linearity  and  stability  of  the  waves 
observed.  We  then  compare  this  wave  fit  with  the  MODE  wave  fits, 
and  from  the  results  of  the  three  wave  fits,  we  make  general 
statements  on  the  wave  dynamics  in  the  area  occupied  by  the 
experiments.  We  also  compare  the  tomographic  inverse  method  of 
Cornuelle  (1983)  and  Cornuelle  et  al.  (1985)  with  our  method,  and 
analyse  the  ability  of  the  acoustic,  spot  and  mixed  observational 
systems  in  observing  the  waves  and  mapping  the  ocean.  We  then  make 
concluding  remarks  on  the  investigation. 

The  motion  of  the  acoustic  moorings,  if  not  tracked,  can  be 
misinterpreted  as  oceanic  fluctuations  in  a  tomographic  inversion. 
However,  for  economical  reasons,  it  is  highly  desirable  to  know 
whether  reliable  acoustic  mapping  of  the  ocean  structure  can  still 
be  generated  without  the  deployment  of  navigational  systems  for 
tracking  mooring  motions,  but  rather  through  parameterization  of  the 
mooring  motions,  as  was  done  by  Cornuelle  (1983).  As  a  secondary 
contribution  by  this  dissertation,  a  study  of  this  engineering 
problem  is  presented  in  Ch.  7. 

In  Ch.  7,  we  derive  bounds  on  the  error  of  the  tomographic 
sound-speed  estimate  in  the  presence  of  untracked  mooring  motions. 

An  important  result  shows  that  the  error  variance  of  the  estimate  is 
practically  invariant  with  the  size  of  mooring  motion  but  is  almost 
always  reaching  the  upper  variance  bound.  The  implication  is  that, 
given  a  priori  information  about  the  field,  the  geometry  of  the 


tomographic  array,  and  the  noise  level  ,  the  upper  bound  can  be 
evaluated  to  give  an  indication  about  whether  it  will  be  necessary 
to  track  the  moorings  before  the  deployment. 

Not  to  bore  the  readers  who  are  experts  on  the  subjects  of 
planetary  waves  and  parameter  estimation,  or  only  interested  in  the 
data-model  relations  and  the  estimation  results,  we  take  this 
opportunity  to  inform  them  to  skip  Chs.  2  and  4  in  their  reading. 
These  two  chapters  contain  only  review  material.  The  literature  on 
the  two  sii>jects  is  vast,  and  our  only  excuse  for  writing  Chs.  2  and 
4  is  to  define  the  mathematical  notation  used  in  this  thesis.  New 
material  and  results  are  contained  in  Chs.  5,  6  and  7,  and  in  part 
of  Ch.  3.  The  acoustic  forward  problem  considered  in  Ch.  3  has 
previously  been  studied  by  Munk  and  Wunsch  (197 9),  Cornuelle  (1983) 
and  many  others,  and  the  reason  for  the  redundancy  here  is  just  to 
make  this  presentation  of  the  forward  problem  a  complete  one.  New 
material  in  Ch.  3  are  the  results  of  the  analyti  cal -mode 
decompositions  of  the  CTD  data,  the  use  of  the  modal  decompositions 
as  a  data  reduction  scheme  and  a  demonstration  of  the  transparency 
of  the  higher  modes  to  acoustic  measurements. 


CHAPTER  2 


MESOSCALE  PERTURBATIONS  AND  WAVE  MOTIONS 

In  the  open  ocean,  the  largest  portion  of  the  total  kinetic 
energy  is  contained  in  the  mesoscale  frequency  band.  Mesoscale 
perturbations  or  eddies  have  characteristic  flow  speeds  of 
centimeters  per  second,  horizontal  length  scales  of  hundreds  of 
kilometers,  vertical  length  scales  comparable  to  the  depth  of  the 
ocean,  typical  oscillation  periods  of  months,  and  westward  phase 
velocities.  Over  nonsteep  and  smooth  bottom  topography,  eddy 
currents  are  basically  horizontal,  the  momentum  balance  is  almost 
geostrophic  and  the  local  dynamics  are  governed  by  the  law  of 
conservation  of  quasi  geostrophic  potential  vorticity. 

Away  from  intense  mean  currents,  lateral  boundaries  and  steep 
bottom  topography,  dispersive  planetary  (or  Rossby)  waves  of  low 
frequencies  and  large  length  scales  can  propagate  due  to  the 
latitudinal  variation  of  the  coriolis  parameter.  These  waves  are 
solutions  of  the  1  inearized  equation  of  the  conservation  of 
quasi  geostrophic  potential  vorticity.  The  linearization  is  valid 
when  the  ratio  of  the  wave  period  to  the  advective  time  is  small 
compared  to  unity.  Under  such  circumstances,  mesoscale  fluctuations 
in  the  flow  field  and  the  density  field  are  direct  consequences  of 
the  propagation  of  planetary  waves;  the  density  fluctuations  are  in 
turn  related  to  temperature  and  sound-speed  perturbations . 


This  chapter  is  intended  to  examine,  by  reviewing  the 
literature,  the  dynamics  of  planetary  waves,  and  the  underlying 
dynamical  and  geometrical  approximations  used  on  the  basic  equations 
of  motions.  Sources  of  reference  are  LeBlond  and  Mysak  (1978)  and 
Pedlosky  (1979)  for  the  scaling  analysis  on  the  basic  equations,  the 
derivation  of  the  quasi  geos  trophic  potential  vorticity  equation  and 
the  general  theory  of  planetary  waves,  Flierl  (197  8)  for  the 
orthonormal  izati on  of  the  quasi  geos  trophic  potential  vorticity 
equation  and  the  derivation  of  the  horizontal-structure  equations 
associated  with  the  normal  modes,  and  Longuet-Hi  ggins  et  al  .  (1967) 
for  the  theory  of  resonant  wave-wave  interactions.  The  mechanisms 
for  wave  generation  and  dissipation  will  not  be  considered,  the 
focus  will  be  on  the  evolution  of  planetary  waves  at  mid-latitude, 
under  the  influnence  of  the  earth's  rotation,  and  under  the  effects 
of  weak  mean  currents,  small  bottom  slopes  and  weakly  nonlinear 
wave-wave  interations.  The  goals  are  to  derive  relations  between 
perturbed  dynamical  variables  and  wave -parameters  such  as 
wave-amplitudes,  wavenumbers  and  wavefrequencies ,  and  most  important 
of  all  ,  to  carefully  study  how  planetary  waves  propagate  and 
interact.  Uur  knowledge  of  mesoscale  variability  can  be  increased 
if  some  dynamical  variables  are  measured  or  remotely  sensed  and  wave 
parameters  are  then  estimated. 


2.1  Governing  Equations  For  Mesoscale  Motions 


2.1.1  Sasic  Laws  Of  Conservation 

The  conservation  1 -.ws  for  an  unforced,  incompressible, 
nondiffusive  (in  both  heat  and  salt)  and  invicid  ocean  model  are 
(LeBlond  and  Mysak,  1978) 


^  -  2uxv  =  -jjyp  +  £.,  (2.1) 

dt  p 


dp  =  0,  (2.2) 

It 

and 

^7.  v_  «  0,  (2.  3) 

where  d/dt  is  the  total  derivative,  all  the  dynamical  variables  are 
functions  of  time  and  space,  v_  is  the  velocity  vector  of  fluid 
particles  relative  to  the  rotating  frame  associated  with  the  earth 
that  has  a  constant  angular  velocity  vector  u  (its  magnitude  is 

C 

u~7.3xl0  rad/s),  p  and  p  are  the  density  of  the  fluid  and  the 
pressure  acting  on  it,  respectively,  and  the  £  vector  is  the 
acceleration  of  gravity  (its  magnitude  is  g~9 .81  m/s  ).  The 


conservation  of  momentum  is  expressed  in  (2.1),  (2.2)  is  a  statement 
reyarding  the  thermodynami c  properties  of  nondi ffusivity  and 
incompressibility,  (2.3)  expresses  continuity  (conservation  of 
volume)  and  is  a  combined  result  of  conservation  of  mass  and  (2.2). 

In  the  static  state  vrfiere  v=0  and  p=pQ  is  a  function  of  depth 
-z  or  the  radial  coordinate  only,  the  hydrostatic  pressure  pQ  is 
related  to  pQ  by 


dp0(z)  =  -  pQ(z)  g.  (2.4) 

We  would  like  to  point  out  that  the  static  state  is  generally 
different  from  the  mean  state,  i.e.  they  would  be  the  same  only  vWien 
there  is  no  mean  motion.  In  a  nonstatic  state  where  the  fluid  has 


motion,  the  pressure  and  density  depart  from  hydrostatics  to  become 
P=Pq+P  '  and  p=pq+p',  and  (2.1)  and  (2.2)  can  be  rewritten  as 


respectively,  where  w  is  the  vertical  or  radial  velocity.  In  (2.5), 
p  is  replaced  by  a  constant  reference  density  p*~l  g/ml  (the 
Boussinesq  approximation)  because  the  variation  of  p  in  both  time 
and  space  is  only  about  one  percent  througout  the  ocean,  hence  the 
replacement  woul  d  insignificantly  alter  the  coriolis  and  inertial 
forces . 

2.1.2  Scalings  And  Approximations 

Scaling  analysis  can  be  employed  to  simplify  the  complicated 
basic  set  of  equations  (2.3),  (2.5)  and  (2.6)  to  a  set  that 
describes  only  mesoscale  motions  at  mid-latitude.  The  method  of 
simplification  which  is  described  in  detail  by  Pedlosky  (1979) 
consists  in,  as  a  first  step,  the  transformation  from  the  spherical 
coordinate  system  to  one  with  x,y  and  z  coordinates  representing  the 
eastward,  northward  and  upward  distances,  repectively,  measured  from 
the  transformed  origin  located  on  the  surface  of  the  ocean,  at  a 
latitude  eQ  where  the  area  under  study  is  centered.  The 
transformation  includes  the  Taylor  expansions  of  the  trigonometric 
functions  of  latitude  o,  which  appear  in  the  equations  because  of 
sphericity,  about  og  in  powers  of  x  and  y.  The  components  of  v^ 
are  now  u,  v  and  w  corresponding  to  the  x,  y  and  z  directions, 
respectively.  As  a  second  step,  the  independent  variables  are 
scaled  and  the  dependent  (dynamical)  variables  are  normalized  so 
that  a  set  of  non  dimensional  equations  is  obtained.  The  scalings 


and  normalizations  are  done  by  using  observed  characteristic 
lengths,  times  and  flow  speeds,  and  also  by  using  observed  or 
estimated  magnitudes  of  w,  p'  and  p'.  The  quantities  used  for  the 
scalings  and  normalizations  are  shown  in  the  second  row  of  table 
(2.1).  At  mid-lati tude,  a  typical  horizontal  length  scale  is  L~100 
km,  a  typical  vertical  length  scale  is  H~1  km  and  a  characteristic 
horizontal  flow  speed  is  U“5  cn^s.  From  continuity,  an  estimate  of 
an  upper  bound  for  the  magnitude  of  w  is  UH/L  and  this  quantity  is 
used  for  its  scale.  It  is  important  to  point  out  the  way  that  p' 
and  p'  are  scaled  is  due  mainly  to  our  perception  that  the  motions 
are  almost  geostrophic  and  hydrostatic. 

Next,  the  scaled  dynamical  variables  are  expanded  as 
perturbation  series  in  powers  of  a  small  parameter  e.  Then 
equations  that  describe  the  temporal  and  spatial  behavior  of  the 
nonvanishing  leading  terms  in  the  expansions  are  sought.  The  small 
parameter  is  the  Rossby  number  and,  approximately,  two  other 
important  small  geometrical  ratios: 

e  =  U/fQL  -  L/R  -  H/L  -  10  _2,  (2.7) 

where  R~6. 36x10^  km  is  the  earth's  radius  and  fQ~10"^  rad/s  is 
the  coriolis  parameter  f=2usino  evaluated  at  oQ.  The  smallness  of 
the  Rossby  nunter  U/fQL  and  the  aspect  ratio  H/L  indicates  that 
the  flow  is  predominantly  geostrophic  and  horizontal  .  The  neglect 
of  higher-order  terms  emphasizes  that  our  interest  is  in  local 


I 


dynamics,  with  the  localization  in  space  explicitly  indicated  by  the 


scaling  or  L  H  L/U  U  UH/L  p*fQUL  p*f  gUL/gH 

normalizing  factor 

order  of  magnitude  U  eUH/L  p*f0UL  p*f0UL/gH 

order  of  magnitude  of  error  eU  e^JH/L  ep*fQUL  e p *f qU L /g H 

in  quasi  geos tro phi  c  solution 


/// 
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2.1.3  Quasi  geos  trophy 


After  collecting  non  dimensional  terms  in  the  equations  with  like 
powers  of  e,  we  find  that  to  the  lowest  order  in  e  (that  is  order 
e°),  the  motion  is  geostrophic  (equations  will  be  put  back  in 
dimensional  forms). 


(u.v)  = 


0*f  n 


(2.8) 


hydrostatic. 


3P  •  „ 

_  =  -  p  9. 


(2.9) 


horizontally  nondi vergent,  and  the  zeroth-order  w  vanishes.  Note 
that  p7p*f0  is  a  geostrophic  (zeroth-order )  stream  function  and 

o 

V5p7p*f0  is  the  geostrophic  (zeroth-order)  relative 
vorticity  as  indicated  by  ( 2.8)  ;\72=  32/3x2+  32/3y2  is 
the  horizontal  taplacian.  Equations  (2.8)  and  (2.9),  in  a  sense, 
are  not  too  interesting  because  they  do  not  provide  any  new 
information  nor  information  regarding  the  evolution  of  the 
perturbations  in  time.  However,  it  is  clear  that  w  is  more 
accurately  of  order  cUH/L,  vrfiich  is  even  smaller  than  the  original 
estimate.  The  precise  order  of  magnitudes  of  the  dependent 
variables  are  summarized  in  the  third  row  of  table  (2.1). 


Although  w  is  very  small  (a  first -order  quantity),  it  must  be 
taken  into  account  in  order  to  study  mesoscale  dynamics.  In  fact. 


by  considering  also  the  first-order  equations  in  e,  it  is  found  that 
changes  in  the  vertical  component  of  the  zeroth-order  absolute 
vorticity  (planetary  plus  relative  vorticities)  along  a  particle's 
path  line  are  produced  solely  by  the  streching  of  vortex  tubes  or 
the  small  divergence  of  the  horizontal  flow  3 v^az: 


%  < -L  v5p'  *f  1  =  fo-- 

u  t  p  T  q  3  Z 

where 

^l=l_+ul_+vl.=i_  +  1  (  -  if  'l_  +  (2.10b) 

dt  a t  3x  ay  3 1  p*f g  ay  ax  ax  ay 

As  a  result  of  the  geometrical  scalings  and  the  neglect  of 
higher-order  terms,  the  vertical  planetary  vorticity  or  the  coriolis 
parmeter  f  in  (2.10a)  is  evaluated  locally  as 

f  =  rQ  ♦  ey,  (2.11) 

O 

where  S=2ucos©0/R  (~2x10‘°  rad/ s/ km)  is  the  latitudinal  gradient 
of  f  evaluated  at  oQ.  It  is  also  obtained  that  w  is  related  to  p' 


(2.13) 


where 

N(z)2  =  -  9  !fOlz)  ; 

7s  3Z 

N(z)  is  the  Brunt-Vaisal  a  frequency  that  characterizes  the  stability 
of  the  water  column  and  is  assumed  to  be  known  from  density 
measurements.  Obviously,  the  vertical  displacement  of  isopycnal 
surfaces  (or  isothermal  surfaces  or  surfaces  of  constant  sound 
speed)  is,  from  (2.12), 


n  -  .  (2.14) 

W  az 

We  would  like  to  add  that  in  collecting  terms  to  like  orders,  we 
have  used  the  fact  that  the  Burger's  number  (m/LfQ)^  is  of 
order  one  since  N^IO"^  (ra d/ s ) 2. 

The  consideration  of  quasi  geos  trophy,  that  is  the  small 
deviation  from  geostrophy  or  the  small  w,  leads  further  to  the 
derivation  of  a  single  equation  for  the  stream  function  in  a  closed 
form  (the  equation  is  obtained  by  conbining  (2.10)  and  (2.12)): 


du  o  fn 

C(Vh  +  2-  i_>P  '3 

<Tt  az  "^2"  az 


dP‘=0. 

Tlx 


(2.15) 


Since  it  is  known  that  the  potential  vorticity  C\/x  v+  2u)' 
is  conserved  following  a  fluid  particle  in  an  incompressible, 
adiabatic  (invicid  and  nondl  ffuslve )  and  unforced  ocean  (the  proof 
can  be  found  in  Leblond  and  fysak,  1978),  it  is  interesting  to  point 
out  that  (2.15)  is  simply  a  statement  of  this  conservation  law  but 
following  from  the  applications  of  the  geostrophic,  hydrostatic, 
geometrical  and  Boussinesq  approximations.  Therefore,  the  governing 
equation  for  mesoscale  motions  is  the  conservation  of 
quasi  geostrophic  potential  vorticity. 

We  have  already  derived  relations  between  p1  and  other  dynamical 
variables.  Unce  (2.15)  is  solved  for  p'  with  the  appropriate 
boundary  conditions,  other  dynamical  variables  are  then  known  from 
(2.8),  (2.9)  and  (2.12).  The  solutions  are  not  exact  but  are 
zeroth-order  approximations  for  p',u,v  and  P',  and  a  first-order 
approximation  for  w,  hence  they  are  accurate  to  within  about  IQOe 
percent,  that  is  about  one  percent.  The  order  of  magnitudes  of  the 
errors  in  the  quasi  geos tro phi  c  solutions  are  summarized  in  the 
fourth  row  of  table  (2.1). 


2.2  Boundary  Conditions 


The  boundary  conditions  are  the  continuity  of  pressure  and 
displacement  across  the  disturbed  ocean  surface  at  z=s(x,y,t),  and 
the  vanishing  of  the  normal  velocity  at  the  rigid  bottom  at 
z=-D+b  (x  ,y )  ;  D  is  the  nominal  depth  of  the  ocean  and  D>>  |s |  and  |b| 
However,  it  is  desirable  to  scale  and  approximate  these  conditions 
so  that  they  can  be  replaced  by  a  simplified  but  consistent  version 
that  applies  to  p'  at  z=0  and  z  =  -D  instead.  Otherwise,  it  would  be 
a  very  difficult  task  to  solve  (2.15).  The  simplifications  will  be 
detailed  in  the  following  sections. 

2. 2.1  The  Surface 

The  exact  conditions  are,  at  z=s, 

Pq  (z)  +p'(x,y,z,t)  =  Pa,  (2.16a 

and 

w  =  ds/dt  .  (2.16b) 

The  atmospheric  pressure  p.  can  be  assumed  constant  as  far  as  the 

a 

ocean  is  concerned,  because  the  magnitude  of  the  variation  of  p. 

a 

is  much  smaller  and  the  length  scale  of  variation  is  much  larger. 


After  substituting  the  Taylor  expansions  of  the  dynamical 
variables  about  z=0  in  powers  of  s  in  (2.16),  and  then  dropping 
nonlinear  terms  in  s,  p'  and  w  (so  that  only  the  largest  terms  are 
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kept),  we  obtain,  at  z=Q, 


p'  ~  pQgs  (2.17a) 


and 


w  ~  d^s/dt  (2.17b) 

with  the  uses  of  (2.4)  and  the  identity  PQ=Pa*  The  above  two 
equations  can  be  combined  to  give, at  z=0. 


w  -  ^  (  pl  )  .  (2.18) 

dt  Pq3 

An  order  of  magnitude  analysis  (by  using  table  (2.1))  shows  that 
the  R.H.S.  of  (2.18)  is  of  order  e(  L^f  g/gH)  (UH/L) ,  but  it  is 
also  of  order  e^(UH/L)  since  l^fg/gH  (estimated  with  the 
typical  values  of  L,  fg,  g  and  H)  is  approximately  equal  to  e.  In 
conclusion,  the  R.H.S.  of  (2.18)  that  introduces  only  a  second-order 
correction  to  w  can  be  consistently  discarded  without  affecting  the 
quasi  geos  trophic  solution.  The  result  is  the  rigid -lid 
approximation,  that  is 


35 

w(x*y,0,t)  =0,  (2.19) 

or  equivalently, 

dH  (  -1  3p‘)  =  0  at  2=0,  (2. 2D) 

dt  pV  32 

as  obtained  by  using  (2.12). 

2.  2.  2  The  Bottom 

The  exact  boundary  condition  at  the  bottom  can  be  written  as 

w=u3h/3x  +  v3b/ay  at  z=-0+b.  (2.21) 

Substitution  of  the  linear  expansions  of  u,  v  and  w  about  z=-D  in  b 
and  dropping  the  nonlinear  terms  in  w  and  b  in  (2.21)  gives 

w  ~  u  3b/3x  +  v  sb/ sy  at  z=-D.  (2.22) 

In  order  for  quasi  geos  trophic  theory,  vtfiich  requires  w  to  be  of 
order  ellh/L,  to  remain  valid,  we  must  restrict  the  magnitudes  of  the 
slopes  to  be  approximately  equal  to  or  smaller  than  eH/L.  On  the 


2 

other  hand,  if  the  magnitudes  of  the  slopes  approach  e  H/L,  we  can 
consistently  set  w=0  at  z=-D  without  affecting  the  solution. 

In  using  (2.8),  (2.12)  and  (2.22),  the  condition  for  p'  can  be 
written  as 


( 

dt 


-1  ap'j 
32 


J(p\b) 


at  z=-D, 


where 

ap'ab  ap'  ab 

J(p',b)  = _ - _ 

ax  ay  ay  ax 


(2.  23a) 


(2.  23b) 


is  the  Jacobian  operation 


2.  3  Normal  Modes 


Ultimately,  we  want  to  solve  the  nonlinear  quasi  geos  trophic 
potential  vorticity  equation  (2.15)  stfcject  to  the  nonlinear 
boundary  conti  tions  in  (2.21)  and  (2.23).  However,  if  the  method  of 
separation  of  variables  is  used  to  solve  the  linearized  problem  in 
the  case  of  a  flat  bottom,  a  set  of  z-dependent  eigenfunctions 
f.  (z),  called  the  normal  modes  for  p‘,  are  found.  They  obey  the 
vertical  (structure)  equation: 


d  (  *0  dfi  )  +  x.f.  =0;  i  =0 , 1 , 2. .  (2.24a) 

dz  ^dz  11 


wi  th 


df.  (0)  _  df.  (-0) 
dz  d? 


=  0;  i=0,l  ,2, 


•  I 


(2.  24b) 


-1/  2 

where  is  the  corresponding  eigenvalue,  x^  is  called 
the  radius  of  deformation  of  the  i  th  mode.  Since  the  f  .  (z ) 1  s 
constitute  a  complete  set  of  orthogonal  functions  of  z,  the  solution 
for  the  nonlinear  problem  can  be  cast  as 


P 


V  p’.  IX  ,y  ,t)f .  (z). 


(  2.25) 


In  view  of  (2.8),  (2.9)  and  (2.14),  we  can  also  write 


(u,v)  =  [ 


Y  u.  (x^ .tjf^  (z)  , 
i 


Y  v-i  (x  >y  »t  )fj  (2  )]  »  ( 2.  26 

i 


p1  =  Y  0  pi<x^.t)f.(2) 

i 

and 

n  =  Y  ni  (x*y*t)toj  (z). 

i 


where  f  /  adf^/dz  and  h^  =Df gf. '  /N^.  Futhermore,  the 
modal -ampl  i  tude  functions  are  related  by 


(2.27 


(2.  28 


(ui  *vi  >  = 


3  X 


(2.  29 


p'i  =  -pygD  (2.30 

and 

n.  =  -p'./p*^.  (2.31 

Because  the  vertical  displacement  n  is  intimately  related  to  the 


commonly  observed  sound  speed  (or  temperature),  it  is  used  here  in 
pi  ace  of  w. 

In  (2.25)  to  (2.28),  the  vertical  structure  of  p',  (u,v),  p1  and 
n  is  decomposed  into  normal  modes.  The  modal  decompositions  can  be 
achieved  by  first  solving  the  Sturm-Liouville  problem  in  (2.24)  for 
the  f.  (z)'s  (the  normal  modes  for  p'  )  and  x^  's  with  a  known 

O 

N  ,  one  then  evaluates  the  f'^(z)' s  andh^fzJ's  with 

f.(z)'s,  accordingly.  On  the  other  hand,  one  can  first  obtain  the 

normal  modes  for  n  by  solving  an  equivalent  eigenvalue  problem: 


i  =0 , 1 , 2, ...» 


wi  th 

ht  (0)  =  hj  ( -D)  =0;  i  =0 , 1 , 2 . 


(2.  32a) 


(2.  32b) 


This  is  done  by  Mooers  (1975)  in  his  investigation  of  1  inear  waves 
and  the  corresponding  sound-speed  perturbations  in  a  flat-bottomed 
ocean  with  no  mean  flow.  Equations  (2.32)  can  be  derived  directly 
from  (2.24). 

The  sound-speed  perturbation  field  <sc(x,y,z,t)  is  created  by  the 
vertical  displacement  of  the  surfaces  of  constant  sound  speed: 


d  d 

fic  =  -n  [  cn(z)  - 


c.(z)  ], 


( 2.  33) 


where  Cq  and  dc^dz  are  the  mean  profiles  of  sound  speed  and  its 
adiabatic  gradient  arising  from  the  adiabatic  expansion  or 
compression  of  a  rising  or  sinking  volume  of  fluid,  respectively. 

The  quantity  in  the  bracket  is  the  potential  gradient  of  sound  speed 
(Platte  et  al.,  197  9).  Unlike  the  case  for  p',  compressibility  must 
be  taken  into  account  in  the  evaluation  of  ac  because  the  adiabatic 
gradient  of  sound  speed  is  not  small  In  comparison  with  its 
potential  gradient  and  adiabatic  gradients  do  not  contribute  to 
fluctuations.  A  modal  representation  of  ac  is 


ac  =  -  ^  n.j  (x  ,y  ,t )  f  (2 )»  (2.34a) 


f 091  (z  )  =  hi  ^  _ [CqCj^xC a(z)].  (2.34b) 

dz 

f0gt  can  be  interpreted  as  the  vertical  anomaly  of  sound  speed 
per  unit  displacement  of  the  ith  mode.  The  buoyancy  frequency 
profile  N(z )  measured  during  the  tomographic  experiment  in  1981  is 
plotted  in  Fig.  3.3,  from  which  the  first  three  baroclinic  normal 
modes  for  p'  (or  (u,v))  are  evaluated  and  plotted  in  Fig.  3.4.  The 
corresponding  normal  modes  for  n  and  ac  are  also  evaluated, 
renormalized  to  have  maxima  of  unity  and  plotted  in  Fig.  3.5  and 
3.6,  respectively. 


The  description  of  the  modal  solution  for  quasi  geos  tro  phi  c 
motions  would  be  incomplete  without  the  horizontal  (structure) 
equations  that  govern  the  modal -ampli  tude  functions  p.j(x,y,t). 
Briefly,  (2.15)  is  multiplied  by  f n  (z )  and  p‘  is  replaced  by  its 
modal  representation  in  (2.25).  Integration  along  z  is  then 
performed  to  eliminate  the  z-dependence  of  the  equation.  This 
elimination  is  accompl  ished  wi  th  the  use  of  the  orthonormality 
con  di  ti  on 


Vzlfn(z)dz  “  stn* 


(2.  35) 


where  «in  is  the  krononeker  delta.  For  more  details  regarding  the 
procedure  for  the  orthonormalization,  one  can  consult  Flierl 
(1978).  The  resulting  equations  are 

c’  ♦  _L-  I  ■  ‘VhVPj] 

at  ax  p  *Q  i  j 

a  f°fn(-D)  Y  J(Pi»b )f .  ( -D)  ;  n=0,l ,2 .  (2.36a) 

'Q  C—i  *  ' 

vrfiere  1 


e,  .  =  f.-  (z  )f  .(z  )f(z  )dz. 

ijn  -q  i  j  n 


(2.  36b) 


In  general,  the  modes  are  coupled  because  chey  interact  with  the 
bottom  and  with  each  other  so  energy  can  leak  from  one  mode  to 
another.  But  in  linear  theory  and  in  the  case  of  a  flat  bottom,  th 
modes  are  decoupled. 


2.  4  A  Mean  State 


Let  us  now  introduce  a  depth-dependent  weak  mean  current  yjz). 
By  "weak",  we  mean 

|v|«  U,  (2.37) 

so  that  7  is  small  enough  to  disallow  dynamical  instabilities.  The 
mean  current  can  also  be  decomposed  into  normal  modes: 


“"fn(z)  ■  £  7"fn‘Z>]’  (2-38> 

n  n 

where  u  and  v  are  the  constant  modal  amplitudes  of  the 
n  n 

eastward  and  northward  mean  currents,  respectively,  in  the  nth 
mode.  In  general,  the  kinetic  energy  of  the  lower  modes  dominates, 
so  that  the  mean  current  can  be  parameterized  by  only  a  few  modal 
amplitudes,  and  only  these  modal  amplitudes  appear  in  the 
horizontal -structure  equations  to  represent  the  effects  of  the  mean 
current  on  wave  propagation.  From  the  geostrophic  relation  we  know 
that  the  associated  mean  variation  of  pressure  is 


P '  *  V  P« 


(2.39a) 


P  n(Xvy)  =  P*fo(^ny+vnx)* 


(2.  39b) 


Of  course  p'  must  satisfy  the  time -independent  quasi  geos  trophic 
potential  vorticity  equation  (2.15),  implying  that  the  mean 
modal -ampl  itude  function  p'n  must  satisfy  the  corresponding 
horizontal  equation  (2.36a).  Note  that  there  are  mean  variations  of 
p,  n  and  «c  as  well  ,  and  a  zonal  mean  current  over  a  flat  bottom  is 
always  a  possible  mean  state. 

Let  us  denote  the  modal  amplitude  function  of  the  fluctuating 
pressure  in  the  nth  mode  by  »  such  that 


P'n  =  P'n(*.y)  +  *n(x,y,t). 


It  follows  that  (2.36a)  can  be  written  as 


(2.40) 


0  1  j 


n  — 0 , 1 , 2, .... 


(2.41) 


In  the  following  analyses  of  wave  propagation,  we  restrict  the 

bottom  slopes  b  =3b/ax  and  b  =3b/ay  to  be  constants.  This  is 
A  y 

the  same  as  requiring  the  nonlinear  terms  in  x  and  y  of  the  Tayl  or 
series  expansion  of  b  about  the  origin  to  be  of  order  e2H/L  in 


distances  of  order  L.  In  addition,  we  require  small  slopes  such  that 
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b  and  bu  <<  eH/L, 

*  J 


for  preventing  the  existence  of  bottom -trapped  waves 


(2.42) 


2.5  Dispersive  Primary  Waves 


Previously,  McWilliams  and  FI  ierl  (  1975)  have  shown  that  over  90 
percent  of  the  kinetic  energy  in  MODE  was  contained  in  two  empirical 
orthogonal  vertical  modes  that  closely  resemble  the  barotropic  and 
the  first  baroclinic  modes  of  Rossby  waves.  Ri  chman  et  al .  (1977) 
have  shown  that  about  90  percent  of  the  potential  energy,  again  in 
MODE,  was  contained  in  the  first  three  baroclinic  modes,  with  65 
percent  of  the  energy  being  contained  in  the  first  mode  alone. 
Moreover,  by  decomposing  the  CTD  profiles  obtained  in  the 
tomographic  experiment  into  the  normal  modes,  we  have  consistently 
found  that  the  potential  energy  of  the  first  mode  dominates  (Ch.  3, 
Sec.  3.3).  Therefore,  without  discarding  the  major  features  of 
mesoscale  perturbations,  we  can  set  tt^O  for  n>l  in  the  horizontal 
equations  (2.36a).  The  equations  consisting  only  of  the  lowest  two 
modes  can  be  written  as 


Lq(itq)  =  -Cq(ttj)  -  [J(  *g,  +  TrVH*l  ^  (2.43a) 

p*fo 

and 


Ll(wl) 


-C1U0)  '  _,£eiiiJ(  *i»VhV  +  J*  "I’VhV 


(2.  43b) 


where 


9  _ o  3  3  3  r\  T  3  3 

L0  =_  Vh^-^O-^Vh-!  (by--bx-)’ 
at  ax  ax  ay  H5  y3x  ay 


(2.43c) 


L!  *  — 1 VI;-1  !>  ■,6_.*<“o^*vo— 1  (Vh_>  i1  *  eni(“ijj*7i— > Vh 

fn  p  a  a 

♦  Uff(-D)U>  -b  )  •  (2. 

n  1  J'  ->  V  *777 


(2.  43d) 


ax  ay 


3  3  f  3  3 

Cn  ■  -b  );  n=0,l,  (2.43e) 


a  x  ?y 


a  x  ay 


are  linear  operators  (note  that  xQ~0).  Before  seeking  the  wave 

solutions  for  ttq  and  irp  we  make  the  following  observations  from 

(2.43):  (1)  modes  are  linearly  coupled  as  denoted  by  Cj  O 

n  m 

because  the  fluid  motions  interact  with  the  mean  current  and  the 

bottom  slopes,  and  (2)  just  like  the  mean  current,  the  current 

associated  with  a  wave  can  advect  the  vorticity  of  other  waves  as 

denoted  by  the  Jacob ians,  hence  creating  nonlinear  effects. 

The  advection  of  vertical  planetary  vorticity  south  to  north, 

which  is  proportional  to  the  largest  term  B3ir„/ax  in  L  ( *  ), 

n  n  n 

is  responsible  for  the  propagations  of  planetary  waves.  Whether  the 
linearization  for  the  wave  motions  is  valid  or  not  depends  on  the 
smallness  of  the  ratio,  v,  of  the  magnitude  of  nonlinear  terms  to 


the  magnitude  of  8  3*n/3x.  Qualitatively,  the  nonlinear  terms  are 

2  2 

of  order  U  p*f^/L  and  63irn/3x  is  of  order  bp^qU.  Thus 
we  obtain,  approximately. 


v  -  U/l/fi. 


(2. 44) 


By  using  the  typical  values  of  U,  L  and  b,  we  obtain  v~0.25.  This 
is  not  a  small  value  when  compared  to  unity  so  that  nonlinear 
effects  could  be  important.  However,  quantitatively,  v  can  be  much 
smaller  depending  on  the  wavenumbers  of  the  interacting  waves.  The 
quantitative  estimation  is  defered  to  Sec.  2.6.1.  Let  us  assume  for 
the  moment  that  v<<1.  By  the  assumptions  of  small  bottom  slopes  and 
weak  mean  current,  we  know  that  the  ratio  of  the  magnitudes  of 
Cn(  irm)  to  Ln(  irn)  is  much  smaller  than  unity,  and  for 
convenience,  let  us  assume  that  this  ratio  is  also  of  order  v  so 
that  we  can  construct  the  solution  for  ir^  as  a  perturbation  series 
of  powers  of  v  such  that 


_  _  JO)  +  (1)  + 
n  ~  n  n 


{ 2. 45) 


with  ir^1 +^/ir^~v.  We  will  call  the  zeroth -order 

solution  ir^  and  the  first  order  correction  ir^  the 
n  n 

primary  and  secondary  perturbations,  respectively. 


In  the  zeroth-order  approximation,  the  horizontal  equations 


2.5.1  Dispersion  And  Phase  Velocity 

Equation  (2.46)  admits  a  free-wave  solution.  The  properties  of 
these  waves  can  be  investigated  using  a  triple  Fourier  transform. 
Let  the  complex  spectrum  (or  the  Fourier  transform)  of  the 
modal-ampl  itude  function  irn(x,y,t)  be  j6n(  kn,l  n,an )  such 
that 


*i0)  (x,y,t)  = 


^n(kn*1n*%)ei(knX+1ny'<Int)dkndlndon-  (2'47) 


The  spectrum  shows  how  the  pressure  in  the  nth  mode  is  distributed 
in  the  wavenumber -frequency  domain,  the  amplitude  of  each  individual 
wave  being  infinitesimal  in  a  continuous  spectrum.  In  the  case  of  a 
discrete  wave  in  the  nth  mode  with  wavenumber  vector  (k  ,1  )  and 
frequency  a,  £n  would  consist  of  two  impulses  with  equal 
amplitudes  located  at  *(k,1,o).  The  area  under  them  is  the 
amplitude  of  the  wave. 

By  Fourier  transforming  (2.46)  and  then  cancell  ing  /$n,  we  find 


that  the  waves  in  the  nth  mode  (n=0,l)  must  satisfy  the  dispersion 
rel  ation 


a  (k  ,1  ,on)  =  0 

n  n  n  n 


(2. 48a) 


vrtiere 


WV'V 


(k^+lf+Xn)(art  +  6a  )+k  (s  +  68n  ), 
n  n  n  n  n  n  n 


6oq  =  ~^uokO+v010^ 


(2. 48b) 


(2.  48b) 


2  2  2  2 

6a1  =  -[k1(uQ+e111  *1  i(v0*ein  kl  1 - V  ^  (  2.  48c ) 


i+ 1  1*1 


6B„  =  f°  ff(-D)(b  -lfl  b  ). 


k  +1  +  x 
K1  ‘  1  X1 


n  -jj-  n 


y  IT  x 


(2. 48d) 


By  rearranging,  we  get 


( on + 6  o  )  = 

n  n 


-k  (b+6 b  ) 
n  n 


kn+1  n+xn 
n  n  n 


( 2.  49) 


*.’< ,  v  s'v  .'*■  v**-  a ..  .  ‘  v”' 


As  expected,  the  mean  current  causes  Doppler  effects  given  by 
6an's,  which  vanish  vrfien  there  is  no  mean  current.  It  is  seen 
that  the  propagation  of  barotropic  waves  are  not  affected  by  mean 
baroclinic  currents,  and  the  contributions  to  Doppler  shifts  from 
mean  baroclinic  currents  to  the  wavefrequencies  are  minor  for 
baroclinic  waves  with  wavelengths  much  longer  than  the  radius  of 
deformation  ,  (that  is  for  waves  with 

kl+1l<<xl)* 

It  is  the  small  latitudinal  variation  of  the  coriolis  parameter 
(or  the  e-effect)  that  allows  the  propagation  of  waves  with 
subinertial  frequencies  by  changing  the  relative  vorticity. 

However,  the  8-effect  on  wavefrequencies  can  be  modified  by  6Bn  in 
the  presence  of  bottom  slopes.  This  is  so  because  the  slopes  modify 
the  vertical  velocity  and  hence  change  the  relative  vorticity  also 
(see  (2.10a)  and  (2.11)).  The  modification  of  frequencies  caused  by 
the  longitudinal  bottom  slope  bx  is  small  when  waves  are 
propagating  zonal ly,  that  is  when  ln/kn<<l.  The  s-effect  is 
enhanced  or  reduced  depending  on  the  direction  of  rising  (or 
falling)  topography  and  the  direction  of  wave  propagation.  Because 
the  energy  of  baroclinic  waves  is  trapped  more  in  the  upper  water 
column  than  that  of  the  barotropic  waves,  baroclinic  waves  are  less 
affected  by  the  slopes  (note  that  I f ,  (  -D) I  <fn(  -D)=l  in  (2.48d)). 


The  phase-velocity  vector  of  a  wave  in  the  nth  mode  is 


o  a 

=  (JU  =  K 


8+58 


60  k  8  +  <SB  6a 

+  n)  ,  -(  n)  _ "-  n].  (2.50) 


k  l 
n  n 


,2+,2+1 


kfc+m 

n  n  n 


]n  k  +1  +x  ^ 
n  n  n 


The  east  component  of  is  almost  always  negative  because  of 
siitall  slopes  and  weak  mean  current  that  usually  imply  s>  |s&nj  anc 

2  2  i 

a/ ( kn+l n  ^ >  i6ar/lcnl  *  Thl's  feature  westward  phase 
propagation  is  generally  observed  in  experiments. 

By  rearranging  (2.49),  we  get 
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(2.51) 


Since  k^  and  ln  are  real  for  propagating  waves,  the  R.  H.S.  of 
(2.51)  must  be  positive  such  that 
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(2.52] 


implying  that  there  is  a  frequency  limit  for  wave  propagation. 

general  ,  the  upper  cutoff  frequency  depends  on  mode  nunter, 

wavenumber,  mean  current,  and  bottom  slope.  Because 
-1/2  -1/2 

Xq  »Xj  ,  the  cutoff  frequencies  for  baroclinic  waves 
are  much  smaller  than  those  of  barotropic  waves. 
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The  representation  of  ir^  by  a  continuous  sum  of  its 
wavenumber-f requency  components  distributed  in  the  three  dimensional 
wavenuntoer-frequency  spectrum  n  *an )  is  adequate  but 

nolonger  necessary  due  to  the  dispersion  relation.  A  full 
description  of  the  fluctuating  field  can  be  provided  just  as  well  by 
the  simplier  two  dimensional  wavenuirter-spectrum  (k„  ,1  ) 
such  that 
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(2.53) 


2.5.2  Narrowband  Processes  And  Group  Velocity 


For  fluctuations  due  to  narrowband  processes  in  the  wave-number 
spectrum,  ( i6n (kn »1  n ) j  contains  pulses  of  finite  width  and 
ir^  can  be  represented  by  a  sum  of  modulated  waves.  With  a 
total  number  of  Nn  pairs  of  pulses  in  |/>n|  and  with  the  ith  pair 
being  located  at  *(kpl- ,1  nl- ),  we  can  write 


(x,y  ,t) 
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where  each  modulating  amplitude  (or  envelope)  api  is  slowly 
varying  in  space  and  time  as  compared  to  its  carrier  which  has  a 
phase  constant  -rni-  and  a  frequency  ani-  that  satisfies  the 
dispersion  relation.  The  slowly  varying  nature  of  ap1-  in  x  and  y 
is  implied  directly  by  the  narrovband  processes  in  the  wavenunber 
spectrum;  the  slowly  varying  nature  in  t  can  be  verified  by 
investigating  the  group  velocity. 

While  the  phases  of  the  carrier  waves  are  propagating  with  the 
phase  velocities,  the  phases  of  their  envelopes  are  propagating  wi  th 
the  corresponding  group  velocities.  The  group-vel oci ty  vector  can 
be  eval  uated  by 
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(2.55) 


The  result  (vrfiich  is  not  shown  here)  is  a  complicated  vector 
function  of  kn  and  ln,  indicating  that  in  general  the  modulating 
envelopes  can  propagate  in  any  direction  and  that  the  group  speeds 
are  much  smaller  than  the  cc 'responding  phase  speeds.  Since  an1- 
is  varying  very  slowly  in  time  and  space,  can  be 
approximated  locally  by  a  sum  of  discrete  waves  with  constant 
amplitudes  ani-,  where  ani  is  equal  to  the  area  under  the  ith 
pair  of  pulses  in  Id  I  . 


2.6  Mode  Couplings  and  Nonlinear  Interactions 


Since  the  coupling  and  nonlinear  terms  in  the  horizontal 
equations  are  not  identically  zero  but  finite,  intermodal  wave 
forcing  and  nonlinear  wave-wave  interactions  must  occur  during  wave 
propagation.  In  order  to  investigate  coupling  and  nonlinear 
effects,  we  must  proceed  to  the  next  order  in  v. 

To  order  v,  we  have 
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It  is  seen  that  the  zeroth-order  solutions  are  now  the 
forcing  mechanisms  for  the  first-order  terms  This 

implies  that  secondary  waves  of  smaller  amplitude  can  be  generated 
by  the  primary  waves  through  their  nonlinear  interactions  and  the 
linear  couplings.  If  some  of  the  forced  (secondary)  waves  are  at 
resonance,  that  is  their  wavenumbers  and  frequencies  also  satisfy 
the  dispersion  relation  (a  secular  effect),  their  amplitudes  will 
not  remain  small  but  will  grow,  and  at  some  time  will  become 
dominant  among  all  the  forced  waves. 


Before  going  into  the  subject  of  forced  and  resonant  waves  in 
more  detail  ,  we  will  first  come  back  to  the  issue  of  whether  the 


effects  of  the  nonl  inear  interactions  of  primary  waves  are  small  or 
not.  The  issue  is  important  because  the  validity  of  the  asymptotic 
solution  constructed  as  a  perturbation  series  in  powers  of  v  depends 
on  the  smallness  of  v. 


2.6.1  Magnitudes  Of  The  Nonlinear  Terms 


2 

It  was  mentioned  earlier  that  \>  is  of  order  U/L  b~0.25  and  is 
not  qualitatively  small.  But  quantitatively,  it  can  be  smaller 
depending  on  the  wavenuntoers  of  the  interacting  primary  waves.  This 
fact  will  be  demonstrated  in  this  section. 

There  are  three  cases  that  we  need  to  consider.  They  are  the 
interactions  between  (1)  two  barotropi c  waves ,  (2)  two  barocl  inic 
waves  and  (3)  one  barotropic  and  one  barocl  inic  wave.  We  do  not 
need  to  consider  cases  for  more  than  two  waves  because  each 
combination  of  two  can  be  considered  seperately.  When  we  say  a 
wave,  it  could  imply  either  one  wave  that  is  associated  with  a 
discrete  (or  narrowband)  spectrum  or  one  infinitesimal  group  of 
waves  centered  at  some  wavenumber  in  a  continuous  spectrum. 

In  cases  (1)  and  (2),  the  only  nonvanishing  nonlinear  term  is 
proportional  to  J(  )  W1  ^  n=0  and  n=1  for 

the  first  and  second  cases,  repectively,  and 


(0) 


=  anl  cosenl 


+  an2cosen2’ 
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where  a„,  is  the  amplitude  and  o-„.=k,.x+l  ,-y-a„^t+T„<  is 
m  r  m  nl  nrni  ni 

the  phase  of  the  i  th  wave;  i=l,2.  Note  that 

aH 1  ( lCfl  1  *1 n ■*  Jd^ndl n  in  the  case  of  a  continuous 

spectrum.  The  nonvanishing  Jacobian  term  can  be  cast  as 
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But  since 
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(2.59) 


(2.58)  becomes,  after  performing  the  Jacobian  operation. 
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It  is  seen  that  the  magnitude  of  the  nonlinear  term  depends  on  the 
difference  of  the  squared  magnitudes  of  the  wavenumber  vectors  and 
the  difference  of  the  directions  of  propagation;  the  smaller  the 
differences  are,  the  smaller  the  nonlinear  effects.  In  the  limit 
vrfien  the  waves  have  either  the  same  wavelength  or  the  same  direction 
of  propagation,  there  cannot  be  any  nonlinear  interactions,  and  the 
waves  will  be  an  exact  solution  to  the  quasi  geos tro phi  c  vorticity 
equation.  From  (2.59),  we  notice  that  a  single  wave  is  always  an 
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and 

Tfj0^  =  a^cose^,  (2.61b) 

the  sum  of  the  nonvanishing  nonlinear  terms  is  proportional  to 
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It  is  found  here  that  the  magnitude  of  the  nonlinear  term  again 

depends  on  the  difference  of  the  directions  propagation,  and  also 

depends  on  the  difference  of  +1  311(1 

2  2 

(Icqi+1qi).  Similarly,  the  smaller  the  differences  are, 
the  smaller  the  nonlinear  effects.  There  would  not  be  any  nonlinear 
interactions  if  either  the  waves  of  different  modes  were  propagating 
in  the  same  direction  or  the  difference  of  the  squared  magnitudes  of 
the  barotropic  and  the  barocllnic  wavenuntoer  vectors  were  exactly 

V 

In  conclusion.  Inorder  for  the  (asymptotic)  theory  of  weak 
wave-wave  interactions,  vrtil ch  predicts  the  propagation  of  forced 
waves  and  resonant  interactions,  to  be  applicable,  the  wavenumbers 
of  the  primary  waves  must  be  so  arranged  that  they  make  \*<1. 
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2.6.2  Forced  Secondary  Waves 

In  this  and  the  following  sections,  we  will  discuss  only  forced 
and  resonant  waves  that  correspond  to  two  primary  baroclinic  waves. 
The  other  two  cases  can  be  investigated  by  a  similar  procedure; 
their  results  are  summarized  in  tables  (2.2)  and  (2.4)  without 
futher  discussion.  For  the  case  of  more  than  two  primary  waves,  it 
is  obvious  that  the  forced  solutions  due  to  each  primary  wave  In  the 
linear  coupling  terms  and  each  combination  of  two  primary  waves  in 
the  nonlinear  terms  can  be  summed  together  to  give  the  total 
sol  ution. 

Secondary  perturbations  are  driven  by  the  primary  dispersive 
waves.  For  two  existing  primary  baroclinic  waves  such  that 

T^saj^cose^+a^cose^*  (2.63) 

the  governing  equation  (2.56)  for  the  secondary  perturbations  becomes 
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Note  that  beside  secondary  barocl  inic  perturbations,  secondary 
barotropic  perturbations  are  also  possible  due  to  mode  coupling. 
(Note  also  that  mode  coupling  can  modify  the  frequencies  of  the 
primary  waves.)  While  the  forcing  produced  by  linear  coupling  has 
components  that  oscillate  with  the  same  primary  wavenunbers  and 
frequencies,  the  forcings  produced  by  nonlinear  interactions  have 
components  that  oscillate  with  the  sums  and  differences  of  the 
primary  wavenumbers  and  frequencies. 

The  equations  for  the  i/^'s  are  linear  but  nonhomo  gen  eous , 
containing  simple  harmonic  forcing  functions  in  space  and  time; 
therefore  the  steady-state  solutions  have  the  same  harmonic  forms  as 
the  forcing  functions.  By  expecting  a  phase  lead  or  lag  of  90°,  we 
can  write  down  the  solution  as 
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where  1=^11^^12  i  =  l,2  and 

°t)4=0,12*  Wlth  t*ie  use  (2.64),  the  wave  amplitudes  bn^  are 


eval  uated  as 


The  secondary  perturbations  consist  of  forced  waves.  In  this 


case,  there  are  four  forced  barotropic  waves  and  two  forced 
baroclinc  waves.  Their  amplitudes  are  a  factor  v  smaller  than  those 
of  the  primary  waves  except  at  resonance.  Moreover,  they  need 
continuous  forcing  to  exist,  that  is  the  primary  waves  must  be  quite 
permanent  for  the  forced  secondary  waves  to  exist. 

2.6.3  Resonant  Secondary  Waves 


When  a  forced  wave  of  the  nth  mode  with  wavenunbers 
(kn f ,1  nf )  and  frequency  anf  satisfies  the  dispersion  relation 
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(2.67) 


resonance  occurs  and  (2.67)  is  the  resonance  condition.  At 
resonance,  the  expressions  shown  in  the  last  section  for  wave 
amplitudes  are  nolonger  valid  because  the  denominator  is  identically 
zero  and  the  resonant  wave  amplitude  is  growing  linearly  in  time. 

The  two  forced  barotropic  waves  with  phases  and  o12 
cannot  be  resonant  because  the  wavenumbers  and  frequencies  that 
satisfy  the  baroclinic  dispersion  relation  can  never,  at  the  same 
time,  satisfy  the  barotropic  resonance  condition  due  to  the  form  of 
the  dispersion  relation.  However,  the  other  four  with 
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are  possible  resonant  waves.  Suppose  resonance  occurs  at  n=l.  Then 
at  the  sums  of  the  wavenumbers  and  frequencies,  there  will  be  a 
resonant  baroclinic  wave  having  the  form  bj^ft  Jcosfo-^+o^). 

With  the  use  of  (2.64b),  we  find  that  the  growth  rate  of  the 
ampl  itude  is 
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The  growth  rates  evaluated  at  resonance  for  the  other  three 
possibilities  are  shown  in  table  (2.3). 

It  is  interesting  to  point  out  that  the  growth  rates  do  not 
depend  on  the  mean  current  and  the  bottom  topography,  but  are 
proportional  to  the  magnitudes  of  the  corresponding  nonlinear  terms. 
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Table  2.2 


Interaction  Between  2  Oth-Mode  Primary  Waves 

primary  wave  secondary  wave  resonance  growth 

amplitude  phase  amplitude  phase  rate 
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Table  2.3 


Interaction  Between  2  1st -Mode  Primary  Waves 


mode  primary  wave  secondary  wave  resonance  growth 
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Table  2.4 

Interaction  Between  1  Qth-Mode  and  1  Ist-Mode  Primary  Waves 

mode  primary  wave  secondary  wave  resonance  growth 

amplitude  phase  amplitude  phas-!  rate 
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CHAPTER  3 


THE  FORWARD  PROBLEM:  RELATING  OBSERVATIONS  TO  WAVE  PARAMETERS 


While  the  baroclinic  planetary  waves  produce  significant  changes 
in  both  the  horizontal-current  and  vertical-displacement  fields 
(i.e.,  temperature  field),  the  baro tropic  planetary  waves  produce 
only  significant  changes  in  the  horizontal-current  field  and  very 
little  vertical  displacements.  Thus,  the  barocl  ini c  waves  are 
ooservable  through  temperature  measurements  alone  but  the 
observations  of  both  types  of  waves  must  be  accomplished  with 
combined  measurements  of  current  and  temperature. 

In  our  investigation  of  the  existence  and  dynamics  of  planetary 
waves,  we  used  the  different  types  of  temperature  measurements 
obtained  in  the  1981  Ocean  Acoustic  Tomography  Experiment.  Data 
were  provided  by  The  Ocean  Tomography  Group.  Although  current 
measurements  were  also  available,  they  were  not  used  in  the  study. 
The  current  measurements  lack  spatial  resolution  since  current 
meters  were  mounted  on  two  enviromental  moorings  only  (but  we  have 
used  the  temperature  records  from  those  moorings).  Thus,  we  are 
limited  to  the  detection  of  the  baroclinic  waves  only.  Three  types 
of  temperature  measurements  were  made.  They  are  the  in- situ 
profiles,  the  point  measurements,  and  the  integral  measurements 
(i.e.,  the  acoustic  travel  times),  obtained  from  the  CTD  surveys, 
the  moored  temperature  recorders  and  sensors,  and  the  acoustic 
tomographic  array,  respectively. 


In  order  to  extract  information  on  the  baroclinic  waves  from  the 
temperature  measurements,  the  forward  problem  of  how  the  temperature 
field  and  its  measurements  are  affected  by  the  evolution  of  the 
waves  must  first  be  resolved.  This  is  done  in  this  chapter  in 
conjunction  with  the  last  one  (Ch.  2).  In  the  last  chapter,  we 
studied  the  theory  of  planetary  waves  by  reviewing  the  literature. 

We  saw  that  the  space  and  time  behavior  of  the  wave-induced 
perturDations  of  sound  speed  (or  equivalently  of  temperature)  are 
constrained  by  the  modal  dispersion  relationships  and  characterized 
by  the  wave  parameters  such  as  the  wavenunbers,  wave  amplitudes, 
modal  amplitudes  of  the  mean  flow,  etc.  In  this  chapter,  the 
oojective  is  to  develop  the  model  equations  that  relate  the  data  to 
the  unknown  parameters  that  characterize  the  wave-induced 
perturbations  and  mean-flow  induced  variations  of  sound  speed.  In 
Cn.  5,  the  model  equations  are  Inverted  for  the  wave  parameters.  Of 
course,  one  can  use  either  the  perturbations  of  sound  speed  or 
temperature  as  the  observed  dynamical  variable  in  the  model 
equations,  for  the  two  variables  are  intimately  related  and 
empirically  proportional  to  each  other  (Wilson,  1960  and  Medwin, 
1976).  We  prefer  to  use  the  sound-speed  perturbation  sc. 

We  begin  in  this  chapter  by  giving  a  brief  description  of  the 
1 981  Ocean  Tomography  Experiment.  For  a  detailed  description  of  the 
experiment,  the  reader  is  referred  to  the  Ocean  Tomography  Group 
(1982).  Next,  the  empirical  relation  between  temperature  and  6C  and 
the  integral  relation  between  perturbation  of  acoustic  travel  time 


and  6C  are  discussed.  We  also  discuss  the  data  set  actually  being 
used  in  the  model  equations  for  the  parameter  estimations.  The  data 
set  was  obtained  by  filtering  (daily  averaging)  the  point  and 
integral  measurements  and  compressing  the  profile  measurements. 
Finally,  we  present  three  plausible  dynamical  models  for  wave 
propagation  and  develop  the  model  equations.  The  space  and  time 
behavior  of  the  wave-induced  6C  is  constrained  and  characterized 
differently  in  the  di fferent  model s.  By  fitting  the  different 
wave -propagation  models  to  the  data  set,  the  wave  dynamics  are  then 
estimated  in  Ch.  5. 
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3.1  The  Experiment 

In  the  spring  of  1981,  the  Ocean  Tomography  Group  conducted  the 
first  field  test  of  a  full  tomographic  system  in  a  300  km  square 
south-west  of  Bermuda  over  a  period  of  4  months  (Ocean  Tomography 
Group,  1982).  The  goal  was  to  test  the  practicality  of  the 
acoustic  inverse  scheme  of  Munk  and  Wunsch  (1979)  for  monitoring  the 
ocean  interior  at  mesoscale  resolution.  In  order  to  evaluate  the 
performance  of  the  tomographic  system,  the  region  was  also  measured 
by  traditional  techniques  during  the  same  period  so  that  a  basis  for 
comparison  could  be  provided.  The  tomographic  data  was  inverted  by 
Cornuelle  (1983)  andCornuelle  et  al .  (1985)  on  a  daily  basis;  the 
daily  tomographic  maps  he  generated  compare  favorably  with  the 
snip-based  objective  maps.  This  work  demonstrated  the  great 
potential  of  acoustic  tomography  for  adequate  and  effective  large 
scale  monitoring.  Here,  our  chief  goal  is  to  investigate  the 
existence  and  dynamics  of  planetary  waves,  and  in  order  to  make  the 
oest  estimates  of  the  wave  parameters  and  wave  dynamics,  we 
incorporate  all  types  of  temperature  measurements  in  our  inversions. 

The  experimental  square  was  centered  at26°N,  70°w  over  the 
Hatter  as  abyssal  plain  and  just  south  of  the  region  in  which  MODE 
was  conducted.  The  ocean  bottom  here  has  a  nominal  depth  of  5400  m 
and  a  very  small  depth  variation  of  300  m  over  the  square.  The 
tomographic  system  itself  consisted  of  a  horizontal  array  of  4 
sources  and  5  receivers  moored  at  a  nominal  depth  of  2000  m 


surrounding  the  square.  All  the  acoustic  sources  (Si;  i=l,2,3,4) 
were  moored  at  the  western  boundary,  4  of  the  receivers  (Ri; 
i  =  l,2,3,4)  were  moored  at  the  eastern  boundary  and  the  remaining 
receiver  (R5)  was  moored  near  the  northern  boundary  of  the  square. 
Using  the  signal  processing  technique  of  Spindel  (1979),  a  224  Hz 
carrier  modulated  by  a  repetition  of  a  maximal  length  shift  register 
sequence  that  lasted  nearly  3  minutes  was  transmitted  hourly  on 
every  tnird  day  between  each  of  the  source- receiver  pairs,  and 
through  a  form  of  matched  filtering,  the  multipath  travel  times  of 
the  sequence  were  estimated.  Although  the  transmissions  were 
intended  to  last  for  4  months,  more  than  half  of  the  receivers  had 
stopped  recording  data  after  80  days  into  the  experiment  due  to 
failure  of  the  batteries.  The  motions  of  the  acoustic  moorings  were 
tracked  by  bottom-mounted  acoustic  transponders.  The  tracking  was 
needed  to  prevent  the  misinterpretation  of  the  large  changes  in 
travel  times  due  to  mooring  motion  as  changes  due  to  oceanic 
perturbations.  However,  some  of  the  tracking  data  were  missing  and 
hence  mooring  motions  must  also  be  dealt  with  in  the  model 
equations;  that  is  in  addition  to  «c  ,  the  uncertainty  of  the  the 
positions  of  the  sources  and  receivers  must  also  be  modelled. 
(Cornuelle,  1983  contains  a  detailed  discussion  of  how  to  model  the 
mooring  moti  ons. ) 

The  horizontal  geometry  of  the  tomographic  array  is  shown  in 
Fig.  3.1.  Besides  the  9  acoustic  moorings,  2  environmental 
moorings,  denoted  by  El  and  E2  in  Fig  3.1,  were  also  deployed. 


Figure  3.1.  The  horizontal  geometry  of  the  1981  Ocean  Acoustic 
Tomography  Experiment  (from  Cornuelle  etal.,  1985),  showing  4 
source  moorings  (SI,  S2,  S3  and  S4),  5  receiver  moorings  (Rl,  R2, 

R3,  R4  and  R5)  and  2  envlromental  moorings  (El  and  E2) .  The  diagram 
also  shows  the  topography  of  the  experimental  region. 


Current  meters  were  mounted  on  the  environmental  moorings  but  not  on 
the  acoustic  moorings.  A  total  of  32  temperature-pressure  recorders 
and  temperature  sensors  were  distributed  on  the  moorings  and  mounted 
at  different  depths.  However,  most  of  them  were  not  useful  for  our 
purpose  because  they  were  mounted  either  in  shallow  (above  300  m)  or 
deep  (below  1600  m)  water,  where  information  on  the  lowest 
barocl  ini c -mode  planetary  waves  is  hardly  obtainable.  While  the 
temperature  field  in  the  upper  layers  cannot  be  described  by  the 
lower  modes  alone  and  contains  strong  higher-mode  perturbations,  the 
data  obtained  in  the  deep  zone  contain  little  wave  signal  (i.e. 
shows  very  little  variation). 

Three  CTD  surveys  in  March,  May  and  July  and  2  AXBT  surveys  in 
April  and  June  were  conducted.  Each  CTD  survey  lasted  2.5  weeks  and 
had  t>b  casts  distributed  evenly  over  most  of  the  square,  but  denser 
in  the  middle.  Each  AXBT  survey  had  drops  distributed  at  the  same 
locations  as  the  CTD  stations,  but  such  drops  are  limited  to 
surveying  the  upper  layer  of  the  ocean  only,  and  thus  are  not  useful 
for  our  purpose. 


3.2  Observations  of  sound  speed  perturbations. 

3.2.1  Profile  and  Point  Measurements 

The  speed  of  sound  in  water,  c,  is  given  by  the  square  root  of 
the  ratio  of  the  adiabatic  compressibility  and  density  (a  derivation 
of  the  relation  can  be  found  in  Clay  and  Medwin,  1977).  As  the 
adiabatic  compressibil  ity  and  density  depend  on  temperature  T, 
salinity  S  and  pressure  p  (or  depth  -z)  so  does  c.  Empirical 
formulae  for  the  determination  of  c  from  T,  S  and  p  or  z  have  been 
generated  by  oceanographic  acousticians  using  regression  techniques 
and  polynomial  least  square  fittings  of  laboratory  velocimeter 
measurements  of  sea  water  sound  speed  over  large  ranges  of  S,  T  and 
p.  Some  of  the  well-known  and  highly  accurate  formulae  are  those  of 
Wilson  (1960),  Medwin  (1975)  and  Lovett  (1978);  they  give  almost 
identical  results  for  the  sound  speed. 

The  empirical  formulae  make  it  possible  to  relate  CTD  surveys  to 
the  observations  of  <sc  profiles.  We  prefer  the  formula  of  Medwin 
(1975)  for  its  simplicity;  it  is  given  by 

c  =  1449. 2+4. 6T-0.055T2+0.00029T3 

+( 1.34-0. 010T) (S-35) -0.0l6z,  (3.1) 

where  the  physical  dimensions  of  c,  T,  S  and  z  in  the  equation  are 
n^s,°C,  parts  per  thousand  and  m,  respectively.  CTD  casts  can  be 
converted  to  sonnd-speed  profiles  by  (3.1),  and  a  mean  profile  F(z) 


can  be  estimated  by  averaging  all  the  profiles.  Thus,  for  each  CTD 
cast,  a  profile  measurement  of  sound-speed  perturbation  can  simply 
be  obtained  by 

6c(z)  =  c(z)  -  c(z).  (3.2) 

A  mean  temperature  profile  T(z)  can  be  estimated  by  averaging 
all  tne  surveyed  temperature  profiles.  By  varying  c  with  respect  to 
T  in  (3.1)  and  neglecting  the  salinsity  effects,  we  obtain  the 
empirical  relation  between  6C  and  temperature  perturbation  «T=T-T, 
that  is 

6C  =  4.66T-0.11T6T+0.00087T2«T.  (3.3) 

Using  (3.3),  time  series  of  the  sound-speed  perturbation  can  be 
obtained  from  moored  time  records  of  temperature. 

We  have  converted  all  the  CTD  profiles  and  temperature  time 
records  measured  in  the  experiment  to  profiles  and  time  series  of 
6C ,  using  (3.1),  (3.2)  and  (3.3),  respectively. 


3.2.2  Integral  Measurements 


The  description  of  the  acoustic  field  in  a  moving  medium  by  an 
approximate  solution  using  geometrical  optics  is  valid  vrfien  the 
changes  in  pressure,  density  and  entropy  of  the  medium  are  small 
over  the  wavelengths  of  the  sound  being  transmitted  (Blokhintsev, 
1956).  Such  a  description  is  known  as  ray  acoustics  and  is 
appropriate  for  the  case  of  underwater  sound  transmissions  in  deep 
water  at  relatively  high  acoustic  frequencies,  of  order  200  Hz  and 
higher.  (A  frequency  of  224  Hz  was  used  in  the  tomographic 
system.)  The  ray  solution,  that  is  the  geometrical  optics 
approximation,  for  the  acoustic  pressure  at  a  frequency  ua  can  be 
cast  as 


p  (x,t)  =  A(x  )eiua^c*®^-t^  ,  (3.4) 

d  — 

where  c*  is  an  arbitrary  constant  reference  sound  speed,  and  A  is 
the  amplitude  and  <*>ac*e  is  the  phase  of  the  time-independent 
component  of  pfl.  Blockhintsev  (1956)  has  presented  a  detailed 
derivation  of  the  differential  equations  that  govern  A  and  e.  The 
equation  for  o  is  commonly  known  as  the  eikonal  equation,  relating  o 
to  the  perturbed  sound-speed  field  c+<sc  and  the  flow  field  v  during 
a  transmission  by 


1 V®!2  -  (c*-v  •  V«>2/<c+«c)2. 


(3.5) 
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In  general,  sc  and  v  vary  In  both  space  and  time,  but  they  are 
considered  as  time  invariant  in  the  derivation  of  (3.5),  because 
they  vary  on  a  time  scale  which  is  usually  much  longer  than  the 
duration  of  a  transmission  so  that  the  ocean  can  be  assumed  to  be 
"frozen"  momentarily.  We  will  not  concern  ourselves  with  the 
equation  for  A,  which  is  known  as  the  transport  equation,  because  A 
is  not  directly  related  to  the  travel-time  data.  However,  it  is 
worth  mentioning  that  the  solution  for  A  is  important  for  the 
identification  of  multipath  arrivals.  The  interested  reader  should 
consult  Spiesberger  et  al .  (1980)  for  ray  identifications. 

Ugincius  (1970)  solved  the  eikonal  equation  using  the  method  of 
cn aracteristi cs  and,  by  proving  that  the  direction  of  the 
characteristics  and  acoustic  ray  paths  coincide,  he  then  obtained 
the  equation  for  the  ray  paths  from  the  equation  of  the 
characteristi cs.  In  a  slowly  moving  and  almost  stratified  medium 
witn  |v/c|2,  |(dc/dx)/(dc/dz)|  and  [(dc/ dy )/( dc/ cLz )|  being  much 
smaller  than  unity,  the  ray  equation  can  be  approximated  by 


d  [  ^(x+sx)  -  c*^-  ]  =  0,  (3.6) 

^  C+SC  dS  (C+SC  )2 

where  s  is  the  arc  length  along  a  ray  path,  x=x(s)  is  the  nominal 
trajectory  of  the  ray  path  in  the  unperturbed  and  motionless  state 
and  sx=s£(s)  is  the  deviation  from  x(s)  due  to  the  existence  of  sc 

I  i? 

and  v.  For  the  case  of  mesoscale  eddies,  v/c  is  of  order 
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10" ^  and  the  ratio  of  the  horizontal  to  the  vertical  sound-speed 

-5 

gradients  is  of  order  10  ;  thus  the  ratios  are  indeed  much 

smaller  than  unity.  The  approximate  equation  (3.6)  has  the  solution 
of  planar  rays,  that  is  rays  that  start  out  in  a  vertical  plane  will 
always  remain  in  that  plane.  Munk  (1980)  considered  the  effect  of 
horizontal  mesoscale  sound-speed  gradients  on  horizontal  ray 
oending,  but  found  that  the  bending  is  negligible,  with  the  maximum 
deflection  angle  being  smaller  than  the  horizontal  fractional  change 
in  the  sound  speed.  By  definition,  a  ray  path  is  a  direction  of 
transport  of  acoustic  energy,  and  the  direction  is  the  same  as  the 
normal  to  the  wavefront  (i.e.\7®)  only  when  the  medium  is 
motionless.  With  fixed  locations  of  acoustic  source  and  receiver, 
(3.6)  is  an  eigenvalue  problem.  This  implies  that  depending  on  the 
sound-speed  profile,  sound  energy  may  propagate  in  more  than  one 
discrete  direction  before  reaching  the  receiver,  that  is  there  may 
De  many  ray  paths  that  connect  a  source-receiver  pair.  Multipath 
propagation  is  indeed  a  prominent  feature  in  the  mid-ocean  sound 
channel  and  the  feature  is  fully  exploited  by  acoustic  tomography  in 
attaining  vertical  resolution  in  the  estimation  of  the  perturbed 
sound- speed  field. 

It  is  indicated  in  (3.1)  that  as  temperature  or  pressure 
increases,  so  does  sound  speed.  A  consequence  of  the  competition 
between  decreasing  temperature  and  increasing  pressure  with  depth, 
typical  at  mid  latitudes,  is  the  formation  of  a  sound-speed  minimum 
at  a  depth  of  about  1  km.  This  can  be  seen  in  Fig.  3.2a  in  which  an 


average  sound-speed  profile  in  the  tomographic  region  is  plotted. 

The  souna-speed  minimum  (or  "axis")  of  the  sound  channel  traps  some 
sound  energy  within  it.  For  sound  waves  that  progress  forward  in 
either  an  upward  or  downward  direction,  the  increase  of  sound  speed 
tends  to  refract  them  back  to  the  axis.  The  trapped  energy 
propagates  along  numerous  refracted  ray  paths  that  sample  different 
vertical  sections  of  the  water  column  and  collects  information  about 
the  perturbed  sound-speed  field  through  the  accumulated  travel-time 
changes.  (Fig.  3.2b  shows  the  geometry  of  some  of  the  eigen-rays 
that  connected  the  source  S4  and  the  receiver  R3. )  In  a  pulse 
transmission,  the  trapped  energy  in  the  form  of  multipath  arrivals 
can  be  detected  over  a  long  distance  by  a  receiver  being  placed  near 
or  at  the  axis.  Thus  once  the  multipath  arrivals  of  each  of  the 
source -receiver  transmissions  in  a  tomographic  array  are  identified 
and  resolved,  they  can  be  used  alone  or  together  with  other 
measurements  to  estimate  the  perturbed  sound- speed  field. 

For  a  resolved  ray  path  that  connects  a  source- receiver  pair, 
the  time  required  for  a  signal  to  reach  the  receiver  from  the  source 
is  given  by 


t+«t 


C  *  JC  *  , -ids, 

ds 


(3.7) 


where  the  quantity  in  the  bracket  is  often  referred  as  the  ray 
speed,  t  is  the  travel  time  in  the  unperturbed  and  motionless  state 
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Figure  3.2b.  A  ray  diagram,  showing  the  paths  of  3  of  the 
eigen- rays  that  connected  the  source  S4  to  the  receiver  R3. 


and  6 t  is  its  deviation  due  to  ic  and  v.  It  is  seen  that  in  general 
travel  times  are  perturbed  in  a  very  complicated  manner.  Both  the 
sound-speed  perturbations  and  the  currents  can  affect  travel  times 
directly  and  they  can  also  affect  travel  times  indirectly  by 
changing  the  trajectories  of  the  ray  paths. 

The  evaluation  of  it  can  be  simplified.  Hamilton  et  al.  (1980) 
have  shown  that  for  any  stable  ray,  that  is  any  ray  which  exists  in 
the  mean  state  and  does  not  disappear  or  alter  drastically  its 
geometry  in  the  perturbed  state,  and  for  weak  horizontal  variations 
in  c  and  v,  the  contribution  of  ix  to  it  is  of  higher  order  than 
that  contributed  explicitly  by  the  changes  in  the  ray  speed. 
Therefore,  they  concluded  that  the  perturbed  travel  times  may  be 
evaluated  along  the  unperturbed  ray  paths  without  losing  much 
accuracy.  Furthermore,  for  most  oceanic  fluctuations  |ac|>>|v|  and 


hence  v  may  be  neglected  together  with  ^x.  By  further  neglecting 

o 

terms  of  order  (ac/c)  ,  at  may  be  approximated  by 


(3.8) 


In  (3.8),  at  represents  an  integral  measurement  of  ac .  Because  of 
the  averaging  process,  oceanic  fluctuations  of  smaller  scales  are 
automatically  filtered  from  it.  This  is  one  of  the  many  advantages 
of  acoustic  techniques  over  traditional  techniques  of  spot 
measurements. 
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Although  Mercer  and  Booker  (1983)  have  found  conflicting 
evidence  for  the  validity  of  the  assumption  of  travel-time  linearity 
(3.8)  for  the  case  of  a  warm  eddy  at  temperature  changes  greater 
than  1°C,  the  validity  of  (3.8)  for  the  case  of  planetary  waves  and 
ranges  of  300  km  were  confirmed  by  us  through  a  computer 
simulation.  Planetary  waves  that  correspond  to  «c  of  order  5  nv^s 
and  v  of  order  5  cn^s  in  the  upper  ocean  were  simulated;  these 
values  are  typical  of  the  open  ocean.  Perturbed  and  unperturbed  ray 
paths  over  a  distance  of  300  km  that  connects  a  source  point  and  a 
receiving  point  on  the  channel  axis  were  computed  by  solving  (3.6) 
numerically  with  a  fourth-order  range-dependent  ray-tracing 
technique,  developed  by  the  author  using  the  Runge-Kutta  method 
(Acton,  1970)  and  thus  obtaining  high  numerical  accuracy  at  long 
range.  The  travel- time  perturbations  were  then  computed  numerically 
from  both  (3.7)  and  (3.8),  and  comparisons  made.  Results  of  the 
simulated  study  are  summarized  as  follow: 

(1)  Travel-time  perturbations  of  order  30  ms  are  found. 

(2)  Ray  paths  are  practically  unperturbed.  The  vertical  and 
horizontal  changes  of  their  geometries  are  of  orders  50  m  and  1/2 
km,  respectively.  These  changes  are  small  comparing  to  the  scales 
of  the  mesoscale  perturbations.  Furthermore,  negligible  errors  of 
order  3  ms  are  introduced  in  6t  when  the  unperturbed  ray  paths  are 
used . 

(3)  Current  effects  are  negl  igible.  Travel-time  perturbations 
created  by  the  flow  field  are  found  to  be  of  order  2  ms. 
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A1  though  the  total  error  created  by  the  assumption  of  stationary 
ray  patns  and  the  neglect  of  current  effects  can  be  more  than  10 
percent  of  the  signal  ,  the  estimate  of  a  few  unknown  parameters  is 
generally  unaffected  by  the  error  when  a  large  nunber  of  travel-time 
data  are  available. 


3. 3  Data  Used 


•1 

It  is  well-known  that  at  lower  mi d-1  ati tudes  the  mixed  upper 
layer  of  the  water  is  well  separated  from  the  rest  of  the  ocean  by  a 
sharp  seasonal  thermocline  located  at  a  depth  of  about  200  m 
(Pickard  and  Emery,  1382).  This  large  and  sudden  change  in  the 
density  profile,  that  is  the  seasonal  thermocline,  may  be  viewed  in 
Fig.  (3.3)  in  which  we  show  an  average  profile  of  the  buoyancy 
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frequency  N(z)  in  the  tomographic  region  (N  is  proportional  to 
the  density  gradient).  Physically,  the  seasonal  thermocline 
inhibits  significant  exchange  of  energy  between  the  mixed  layer  and 
the  lower  ocean  that  includes  the  main  thermocline  zone 
(approximately  from  a  depth  of  300  m  to  a  depth  of  1500  m)  and  the 
deep  zone  (below  1500  m  depth),  so  although  the  upper  layer  is 
strongly  forced  by  the  atmospheric  disturbances,  the  lower  ocean  can 
be  left  unforced.  Thus,  an  idealized  unforced  ocean  model  may  be 
used  to  describe  the  dynamics  of  planetary  waves  in  the  entire  ocean 
column  except  the  upper  layer. 

For  these  reasons  and  because  the  potential  energy  of  the  waves 
is  well  contained  in  the  main  thermocline  zone,  we  did  not  use  time 
records  of  temperature  and  travel  time  that  contain  information  on 
the  forced  fluctuations  in  the  upper  layer  or  the  unenergetic 
signals  from  the  deep  zone.  That  is,  the  time  series  records  of  <sc 
that  were  obseved  in  the  upper  layer  or  the  deep  zone  and  the 
resolved  ray  paths  that  cycled  into  the  upper  layer  were  not  used. 


(Mote  that  the  moored  temperature  records  have  been  converted  to  fic 
time  series.)  Consequently,  we  have  eliminated  all  but  7  of  the  <sc 
time  series  and  58  of  the  fit  time  series  in  the  estimations.  We 
have  also  checked  the  deep  fic  time  series  (below  1600  m):  they  have 
very  little  variance,  which  is  cons  is  tent  wi  th  the  theory. 

Some  statistical  information  about  the  mesoscale  variability  in 
the  general  area  of  the  tomographic  region  was  available  from 
previous  experiments,  in  particular,  from  MODE.  Such  information 
concerning  time  scales  and  vertical  structures  can  be  very  helpful 
in  the  data  processing  (such  as  filtering)  needed  for  reducing  the 
noise  level  in  the  data  and  the  size  of  the  data  set.  Once  noise 
ana  data  are  adequately  reduced,  more  accurate  and  efficient 
estimates  can  be  obtained.  Mote  that  statistical  information  can 
also  be  used  to  provide  additional  constraints  on  the  solution  of 
the  inverse  problem;  the  accuracy  of  the  estimate  is  generally 
improved  by  their  application  (see  Ch.  4  for  discussions). 

Daily  averaging  corresponding  to  low-pass  filtering  was 
performed  on  the  <5c  and  fit  time  series  so  that  the  noise  produced  by 
"ui  in  teres  ting"  events  such  as  tides  and  internal  waves  is  reduced. 
Furthermore,  data  points  on  every  third  day  and  on  every  ninth  day 
in  the  filtered  time  series  of  <sc  and  fit  respectively  were  retained 
for  the  estimates.  We  have  not  lost  any  useful  information  by  this 
reduction  of  the  data  because  the  time  scale  of  the  mesoscale  motion 
in  the  area  is  known  to  be  of  order  of  100  days  (Richman  et  al., 
1977). 


McWilliams  and  Flierl  (1975)  have  shown  that  over  90  percent  of 

trie  kinetic  energy  in  MODE  was  contained  in  two  empirical  orthogonal 

vertical  modes  that  closely  resemble  the  barotropic  and  the  first 

baroclinic  modes  of  Rossby  waves.  Furthermore,  Richman  et  al . 

(1977)  have  shown  that  about  90  percent  of  the  potential  energy, 

again  in  MODE,  was  contained  in  the  first  three  baroclinic  modes, 

with  65  percent  of  the  energy  being  contained  in  the  first  Mode 

alone.  Thus,  it  is  evident  that  the  vertical  structure  in  the 

region  is  predominantly  composed  of  only  a  few  of  the  lower  modes. 

In  Fig.  3.4,  we  show  the  first  three  baroclinic  modes  of  currents 

(fj(z);  i=l,2,3),  evaluated  numerically  from  (2.24)  using  N(z) 

shown  in  Fig.  3.3  and  normalized  according  to  (2.35);  the 

oarotropic-current  mode  is  constant  through  out  the  water  column  and 

is  not  shown  in  the  figure.  The  three  corresponding 

2  2 

vertical-displacement  modes,  given  by  h j(z)*DfgN  df^/dz 
where  D=5.4  km  is  the  nominal  depth  and  fg=6. 38x10”^  s“*  is 
the  coriolis  parameter  of  the  region,  but  re-normalized  to  have 
maxima  of  unity,  are  shown  in  Fig.  3.5.  Because  of  the  dominance  of 
the  low  modes,  the  5C  profile  data  can  be  largely  reduced,  and  the 
reduction  will  be  discussed  next. 

The  vertical  modes  of  sound-speed  perturbation ,  g..(z)'s,  can 

be  evaluated  by  (2.34b).  But  due  to  the  fact  that  the  potential 

2 

sound-speed  gradient  is  proportional  to  cN  (Flatte  et  al.,  1979), 
(2.34b)  can  be  recast  as 


Figure  3.4.  The  1st  { _ ),  2nd  (  ♦  ♦  ♦)  and  3rd  (0  0  «)  barocllnlc 

modes  of  horizontal  current  In  the  tomographic  region,  normalized  to 
have  depth -averaged  energies  of  unity. 


gi  (z)  a  hj  (z)c(z)N(z)  . 


(3.9) 


In  Fig.  (3.6),  the  g^ ' s  with  i=l,2,3  are  shown.  Here,  we  have 
r  e-  norma  l  ized  the  g^'s  to  have  maxima  of  unity.  Since  the  g^ 's 
constitute  a  complete  set  of  functions,  the  observed  profiles  of  ac 
can  be  decomposed  as 


:j  (z)  =  X  dij  9i(z*;  3-1 .2,3 . 


(3.10) 


where  d. .  represents  the  weight  of  g.  in  the  jth  profile 

*  J  1 

CTD 

ac  •  .  It  can  be  computed  easily  by  using  the  fact  that  the 

J 

Nn^ 's  (or  (cN)-1g^ 's)  are  orthogonal  to  each  other.  We  can 
interpret  d—  as  the  observed  modal  amplitude  of  ac  at  the 
location  and  time  (x,y,t)=(xi,yi,t  •)  of  the  jth  CTD  cast.  In 

J  J  J 

general  ,  an  exact  modal  representation  of  ac^TD  requires  an 

J 

infinite  sum  in  (3.10).  However,  because  of  the  dominance  of  the 
low  mooes,  the  sum  can  be  truncated  after  a  few  terms  without  losing 
any  valuable  information.  In  fact,  quite  to  the  contrary,  the 
quality  of  the  profile  data  is  improved  since  the  truncation  is  a 
filtering  process  in  which  the  more  oscillatory  but  unenergetic 
higner  mooes  are  totally  eliminated.  An  important  consequence  of 
the  truncation  is  that  the  data  of  an  entire  profile  can  ^.e 
effectively  compressed  into  a  few  modal  amplitudes  that  contain 


Figure  3.6.  The  1st  { _ ),  2nd  (♦  ♦  ♦)  and  3rd  (0  0  >)  modes  of 

sound-speed  perturbation  In  the  tomographic  region,  normalized  to 
have  maxima  of  unity. 


Gin 

equivalent  information.  Therefore,  the  huge  set  of  6C.  's  can 

J 

De  replaced  by  a  managable  set  of  d .  -  ‘ s  for  the  lewer  modes  in  the 

*  \J 

parameter  estimations. 

In  order  to  determine  the  number  of  modal  amplitudes  M  required 
CTD 

to  represent  each  «ci  ,  we  made  the  following  calculations 

J 

with  M=  1 , 2 ,3  for  each  of  the  casts: 


PMj  =  t  1  - 


«rD  - 1  <*,* 2  - 


scCTD 

scj 


dz]  x  100  percent 
(3.11) 


CTD 

where  PM.  is  the  percentage  of  variance  in  tcy  generated  by 

the  first  M  baroclinic  modes  alone.  To  avoid  being  misled  by  the 

fluctuations  in  the  upper  layer,  the  integrations  in  (3.11)  were 
performed  from  300  m  down.  Not  unexpectedly,  j  's  of  50  to  90 
percent  were  found  in  all  the  casts.  This  finding  is  consistent 
with  the  result  of  Richman  et  al .  (1977)  in  MODE.  We  have  also 
found  that  the  contributions  of  the  2nd  and  3rd  modes  to  6C  in  the 
tomographic  region  are  minimal:  there  being  less  than  a  5  percent 
increase  in  the  P2  j  1  s  and  P3  j  ’  s  from  the  Pjj's.  As  a  result 
of  the  above  findings,  we  have  retained  only  the  modal  amplitudes  of 
tne  first  mode,  that  is  a°=d^  ,  for  the  parameter  estimates. 

It  is  an  interesting  fact  that  even  if  higher  modes  do  exist  and 


contain  significant  energy,  they  are  quite  transparent  to  the 
travel-time  measurements.  Higher  modes  are  more  oscillatory  over 
the  vertical  column,  so  that  sound  waves  accumulate  many  canceling 


changes  in  their  travel  times  as  they  propagate  up  and  down  the 
ocean  along  the  multi  paths  before  reaching  the  receiver.  To 
demonstrate  this  fact,  we  simulated  three  perturbed  oceans  that  have 
the  same  horizontal  scale  (of  order  100  km)  in  their  sc.  The  1st, 
2nd  and  3rd  oceans  were  perturbed  solely  by  the  1st,  2nd  and  3rd 
modes,  respectively.  Using  the  geometry  of  the  1981  tomographic 
array  and  the  same  58  ray  paths  used  in  the  estimations,  we  computed 
the  corresponding  <st's.  The  rms  values  of  the  simulated  sc  and 
computed  st's  for  each  ocean  are  summarized  in  Table  3.1.  It  is 
seen  that  even  with  unrealistically  large  higher-mode  perturbations, 
the  second  mode  is  already  transparent  to  the  travel-time 
measurements  at  an  experimental  noise  level  of  5  ms. 
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Me  now  summarize  the  data  set  used  in  the  parameter  estimates  in 
Table  3.2.  The  seven  time  series  records  of  6c  were  distributed 
only  at  3  mooring  sites  El,  E2  and  S3,  and  are  thus  expected  to 
mainly  contain  information  on  the  time  behaviour  of  the  perturbed 
field.  In  contrast,  since  the  duration  of  each  CTD  survey  is 
relatively  short  (2.5  weeks)  as  compared  to  the  wave  period  (of 
order  100  days),  they  should  mainly  contain  spatial  information. 
About  three  ray  paths  per  source- receiver  pair  (which  cycle  almost 
the  entire  depth  of  the  main  thermocline  zone)  were  used.  The 
corresponding  time  series  records  of  travel  time  therefore  contain 
information  on  both  the  time  and  space  behaviour  of  the  perturbed 
field.  Only  the  data  obtained  within  the  period  between  yeardays  61 
ana  139  are  used  since  most  of  the  acoustic  Instruments  had  failed 
after  yearday  139  and  the  experiment  started  roughly  on  yearday  61. 
Thus,  the  data  set  contains  information  on  the  mesoscale 
perturbations  that  is  continuous  in  both  time  and  space  in  the  300 
km  square  over  a  period  of  80  days.  The  postion  26°N,  70°W  and  the 
time  yearday  66  are  defined  hereafter  as  the  point  (x,y,t)={150 
km  ,150  km.  Os)  in  the  tomographic  experimental  coordinate  system. 


Table  (3.2) 

Data  used  in  the  parameter  estimations 

data  type  notation  quantity  duration  no.  of  data  source 
_  (year days) 


moaal  amplitudes  ar.  6  5  66-83  65  1st  CTD  sur 

J 

modal  amplitudes  a9  65  120-137  65  2nd  CTD  sur 

J 


65 

66-83 

65 

1st  CTD  survey 

65 

120-137 

65 

2nd  CTD  survey 

0 

jk 

7 

61-139 

7x27 

temperature  sensors 

0 

jk 

58 

61-133 

58x9 

tomographic  array 

at  time  series  atj^  58  61-133  58x9  tomographic  c 

Note  that  j  is  the  index  for  position  and  k  is  the  index  for  time. 


3.4  The  Wave-Induced  Sound-Speed  Perturbations 

Propagating  planetary  waves  can  be  affected  by  a  number  of 
factors,  such  as  the  presence  or  absence  of  a  mean  flow,  a  bottom 
slope  or  resonant  interactions.  Depending  on  which  of  those  effects 
are  important,  the  corresponding  perturbed  field  can  display  very 
different  space-time  characteristics.  Due  to  the  uncertainty  on 
which  of  those  effects  dominate  in  the  real  situation,  different  but 
plausible  dynamical  models  that  place  emphasis  on  different  factors 
and  are  parameterized  by  different  sets  of  wave  parameters  must  be 
tried  in  the  detection  process.  Thus  the  detection  of  planetary 
wave  involves  both  parameter  estimation  and  model  identification. 

For  the  detection  of  baroclinic  waves  in  the  tomographic  region, 

we  have  estimated  the  wave  and  mean- flow  induced  sound-speed 

perturbations  6Cm(x,t;£)  both  from  our  three  plausible 

wave-propagation  models  {labeled  0,  1  and  2)  and  the  data  set.  The 

results  of  the  wave -parameter  estimation  and  the  goodness  of  each 

model  are  presented  and  discussed  in  Ch.  5.  In  this  section,  we 

describe  the  three  models,  their  associated  ac  's  and  the 

m 

corresponding  sets  of  wave  parameters  £. 

The  ocean  bottom  in  the  area  of  the  experiment  is  quite  flat  so 
tnat  minimal  topographic  effects  on  the  wave  dynamics  should  be 
expected.  Thus,  in  all  three  models,  the  modification  of  the 
8-effect  resulting  from  depth  variations  is  excluded.  The  forced 
waves  resulting  from  nonlinear  Interaction  of  the  dispervsive  waves 


are  also  excluded  in  all  the  models.  The  forced  perturbations  are 
of  higner  order  so  that  they  should  be  negligible.  However,  at 
resonance,  the  forced  perturbations  can  grow  in  time  and  hence  can 
become  significant,  thus  the  possibility  of  resonant  propagation  is 
included  in  Model  2.  Only  the  1st  baroclinic  waves  are  modeled 
because  little  energy  in  the  higher  modes  are  found  in  the  CTD 
casts.  Furthermore,  the  waves  are  assumed  to  be  narrow  band  so  that 
locally  we  can  use  a  discrete-wave  representation. 

Model  0  represents  free  propagation  of  linear  Rossby  waves  over 
a  flat  bottom  in  the  absence  of  a  mean  flow.  The  isopycnal  surfaces 
are  displaced  by  the  baroclinic  waves  so  that  the  corresponding 
sound-speed  perturbations  are  given  by 

<scm(— »t  ;E=£w)  ■  6Cw(-»t;Ew)  (3.12) 

wi  th 

W 

«cw  =  g1(z)  Aicos(kix+liy-ait+Y1 ),  (3.13) 

i=l 

wnere  W  is  the  number  of  first  baroclinic  waves  considered  and  A.., 

( ^  -j » 3  i ) »  a-j  and  are  the  amplitude,  wavenumber  vector, 
frequency  and  phase  of  the  1  th  wave,  respectively.  The  wave 
amplitude  A.,  represents  the  maximum  6C  (which  occurs  at  z=-700m) 


100 


induced  by  the  ith  wave  since  g^(z)  has  been  re-normalized  to  have 
a  maximum  of  unity  that  occurs  at  700  m  depth.  The  space-time 


behavior  of  6CW  is  characterized  by  the  wave  parameters  £=(3^  and 
constrained  by  the  modal  dispersion  relationship  of  the  waves  as 
given  in  (2.49)  without  e  modifications  and  Doppler  shifts, 
i  .e.  ,68n=(San=0.  Because  a i  is  constrained  by  (k ^  ,1  ^ )  in 
(2.49),  a j  is  not  a  free  parameter,  so  that  <$cw  is  completely 
determinable  and  can  be  parameterized  by 


Pyi  =  ( ,1  ^  , . . . , Ay  ,k y ,1  y  ,Yy ) . 


(3.14) 


The  possibility  of  the  existence  of  a  mean  flow  is  added  in 
Model  1.  The  structure  of  the  mean  current  is  assumed  to  consist  of 
the  barotropic  and  the  first  baroclinic  modes  only.  This  assumption 
is  probably  a  good  one  because  the  two  modes  are  known  to  contain 
the  greatest  fraction  of  the  kinetic  energy  in  this  general  area 
(McWilliams  and  Flierl,  1975,  and  Sanford,  1975).  In  this  model  the 
isopycnal  surfaces  are  further  tilted  by  the  baroclinic  mean  current 
(the  thermal  wind  relation).  Therefore,  the  corresponding 
sound-speed  perturbations  are  now  represented  by 


<5CnriCw^-.t;Ew»uo  ,v0  ,U1  ,V1  )+,scc(-;  ul,vl»bo^ 


(3.15) 


with  an  additional  time- in  dependent  mean  variation 


u,  y  v.  x 

sc  =  g,(z)  L  bn  -  l(  )  +  U  )],  (3.16) 

c  1  0  r  U  F*  E) 

where  F  is  a  known  constant,  bg  is  a  constant  for  the  shifting  of 

the  zero- reference  of  6cc  from  the  origin  (x,y)  =  (0,0)  to  the 

correct  position,  and  (uq,Vq)  and  (u^.v^)  are  the  modal 
amplitudes  of  the  barotropic  and  baroclinic  mean  currents, 
respectively.  (Note  that  the  overbars  on  the  mean  modal  amplitudes 
have  oeen  dropped  and  F  is  an  adjusting  factor  resulting  from  the 
different  normalizations  of  f^'s  and  g^s;  F=0.157  in  the 
tomographic  region.)  Due  to  the  Doppler  effects,  the  dispersion 
relationship  of  the  waves  changes  from  that  of  Model  0;  therefore, 
so  does  the  space-time  behavior  of  scw.  The  Doppler  shifts  san 
in  (2.49)  now  exist  and  are  constrained  by  the  (k.-.l^'s, 

(Uq.VqJ's  and  (Uj.v^'s  as  given  in  (2.48C).  Thus,  <scm  is 
now  parameterized  by  E=(£w.u0,v0,u1 ,v2 ,bQ) . 

In  Model  2,  the  possibility  of  the  propagation  of  resonant 
secondary  waves  is  further  included.  The  modeling  requires  the 
replacement  of  A ^  by  A^+G^t  in  (3.13)  where  G ^  represents 
the  growth  rate  of  the  ith  wave.  In  general,  G^  is  constrained  by 
the  wavenumber  vectors  and  wave  amplitudes  of  the  interacting 
primary  waves.  However,  since  the  barotropic  mode  is  not  observable 
in  our  data  set  while  resonant  waves  can  be  generated  by  intermodal 


wave-wave  interactions,  can  only  be  left  as  a  free  parameter  in 
the  model.  The  set  of  parameters  are  now  given  by 

£={Ew*gi*  •••»Gw,uo,vo,ui,vi,0o^ 

The  dynamical  assumptions  made  in  each  model  are  summarized  in 

Table  3.3.  In  addition  to  the  correct  propagation  model  and  its 

parameter  values,  the  nuntoer  of  existing  waves  W  is  also  an 

unknown.  Therefore,  its  integer  value  must  also  be  estimated  in  the 

process  of  detection.  The  estimation  of  W  is  achieved  through 

assumption  and  parameter  estimation,  followed  by  model 

iaentifi cation,  with  each  presumed  value  of  W  being  considered  as 

giving  a  different  sub-model  . 


Table  3.3 


The  Dynamical  Hypothesis  Made  In  Each  Wave-Propagati on  Model 

Model  no.  weak  mean  flow  wave- wave  resonant  flat 

oth  mode  ist  mode  interactions  propagations  bottom 


"X"  denotes  the  assumption  is  made 


3.5  The  Model  Equations 


Detection  is  the  extraction  of  the  desired  signal  from  a 
background  of  noise  (or  other  signals)  by  utilizing  estimation 
methods.  In  our  case  the  desired  signals  are  the  sound-speed 
perturbations  sc  induced  by  the  waves  and  the  mean-flow. 

Obviously,  not  all  the  perturbati ons  of  sound  speed  are  caused 
by  the  baroclinic  waves  and  mean  currents.  There  are  many  other 
oceanic  events  such  as  tides,  internal  waves,  turbulence,  etc.  that 
also  perturb  the  sound  speed.  These  other  sound-speed  fluctuations 
tnerefore,  constitute  the  background  noise  of  our  detection  problem 
and  just  like  the  measurement  noise,  they  too  contaminate  the  data 
set.  But  if  the  signal  generated  by  the  planetary  waves  and  mean 
flow  is  dominant  in  the  data,  the  signal  can  be  detected. 

The  contamination  in  the  data  set  caused  by  the  background 
sound- speed  fluctuations  is  referred  to  as  the  model  noise.  The 
model  and  measurement  noise  combine  to  give  the  experimental  noise 
that  accounts  for  all  the  noise  in  the  model  equations  for  the 


modal-amplitude  data  and  the  sc  time  records.  For  the  qth 
modal-ampl i tude  datum  a^=d^  observed  at 


(x^y ,t)  =  (xqVyq,tq)  and  the  kth  datum 


=<$c°(t=tk )  which  is  the  1th  5c  time  record  observed 


at  (x ,y  ,z  ,t)=(xj  Ky1  ,z1  ,tk ),  the  corresponding  model 


equations  can  be  expressed  simply  as 


aq  =  aq(£)  * 


and 


6C 


fk  -  +  Vlk 


where 


a*"  =  «cm(x„,y„,z=-700  m,t  ) 
q  m  q w  q  *  *  q 


(3.17) 
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(3.18) 


(3.19) 


and 

6Clk  =  5Cm(xl  *yl  *Z1  )  (3,20) 

are  the  signals,  and  va  and  are  the  noise  in  a° 
and  6C°k,  respectively. 

The  formulation  of  the  model  equations  for  the  6t  time  series 
requires  some  special  care.  Tne  content  of  the  st  data  is  more 
complicated  than  that  of  the  other  data.  In  addition  to  the 
baroclinic  waves  and  mean  current,  and  the  background  oceanic 
fluctuations  and  the  measurement  errors,  the  relative  motions  and 
the  uncertainty  in  the  nominal  positions  of  the  acoustic  moorings 
also  contribute  to  the  observed  travel-time  perturbations.  In  fact, 
the  latter  two  contributions  were  dominant.  If  one  were  to  model 
these  mooring-position  related  travel-time  perturbations  as  part  of 
the  experimental  noise,  the  at  time  records  would  suffer  a 
vanishingly  small  ratio  of  signal  to  noise.  In  order  to  improve  the 


quality  of  the  at  data,  as  suggested  by  Cornuelle  (1983),  the 
mooring-position  related  travel-time  perturbations  must  also  be' 
modeled  as  signals,  implying  that  the  uncertainty  in  the  mooring 
positions  must  also  be  parameterized  in  the  acoustic  model  equations. 

A  set  of  relative  mooring-displacement  data  was  available  from 
the  acoustic  navigation  systems.  The  tracking  data  had  already  been 
used  to  eliminate  some  of  the  signal  produced  by  the  mooring  motions 
in  the  travel-time  data.  But,  since  the  set  of  tracking  data  is 
neither  error-free  nor  complete  (a  lot  of  data  were  missing),  the 
untracked  or  unknown  horizontal  displacements  together  with  the 
uncertainty  in  the  horizontal  nominal  positions  of  the  moorings  must 
still  oe  parameterized.  Note  that  the  vertical  translations  of  the 
acoustic  sources  and  receivers  were  small  (of  order  50  m)  and 
produced  very  little  travel-time  perturbations  (of  order  1  ms), 
therefore,  they  need  not  be  parameterized. 

Let  us  consider  the  jth  ray  path  connecting  the  mth  source  Sm  to 
the  nth  receiver  Rn.  According  to  Cornuelle  (1983),  the  additional 
time  required  for  the  acoustic  wave  front  to  travel  from  Sm  to  Rn 
along  the  path  due  to  a  small  elongation  aR  (let's  say  of  order  1 
km)  of  the  horizontal  distance  separating  Sm  and  Rn  can  be 
expressed,  to  lowest  order,  as 

it*  =  r  aR,  (3.21) 

J  J 

wnere  r,  is  the  correspondi ng  ray  parameter,  i.e.,  the  cosine  of 

J 


the  launching  (receiving)  angle  divided  by  the  sound  speed  at  the 

source  (receiver);  r.  is  a  conserved  quantity  along  the  ray.  Let 

J 

tn e  unknown  horizontal-displacement  vectors  at  time  t^  and  the 

time-independent  errors  on  the  assumed  nominal  horizontal-postion 

vectors  of  Sm  and  Rn  be  ^sm^k  ^'^Sm^k  ^  an<* 

L6XRn(tk)*6yRn{tk)3'  and  (AXSn’AySm)  and 
(AxRn,AyRn),  respectively.  It  then  follows  that  the 

D 

corresponding  6tj  at  time  t^  is  given  by 

5tj(tki“rjcosAnnUxRr.-ixSni'5xRn(tk>-sxS»(tk):l 


+r.sfnAnnCAyRn-4yant«yRn(tk)-«yRn(tj()] 


(3.22) 


where  is  the  direction  of  the  horizontal  line  of  transmission 
mn 

from  Sm  to  Rn,  measured  in  degrees  (positive  anticlockwise)  with 
respect  to  the  x-axis,  i.e.,  east-axis. 

We  are  now  in  a  position  to  write  down  the  acoustic  model 
equations.  For  the  travel-time  perturbation  6t9^=6t°(t=t|( ) 
observed  from  the  jth  ray  path  at  time  t^,  the  corresponding 
equation  can  be  cast  symbolically  as 


6tjk  =  ^jk^’^Sm’^Sm’^Rn'^Rn’^Sm’^Sm’^Rn’^Rn5  +  vjk 

(3.23] 

where  v^.  represents  the  total  or  the  experimental  noise  in  at*?.  , 


The  signal  st^  can  be  bitten  as  the  sum  of  two  parts  such  that 

**3>  •  **5,  *  (3-24> 

where  st^at^t^)  is  expressed  in  (3.22)  and  atP^  is  the 

signal  induced  by  the  waves  and  mean  flow  which  can  be  expressed  as,  using 

(3.8), 


(3.25) 


witn  x.  denoting  the  unperturbed  trajectory  of  the  jth  ray  path. 
J 


CHAPTER  4 


PARAMETER  ESTIMATION  AND  THE  GENERAL  NONLINEAR  PROBLEM 

For  a  typical  scientific  investigation,  parameter  estimation  (or 
inversion)  and  model  discrimination  are  the  two  crucial  steps  in  the 
process  of  extracting  information  from  data  obtained  in 
experiments.  Of  course,  a  successful  investigation  also  depends 
critically  on  the  understanding  of  the  physical  situation  and  the 
planning  of  the  experiments.  While  the  physical  knowledge  enables 
us  to  develop  plausible  mathematical  models,  relating  the  physical 
parameters  that  characterize  the  physical  situation  to  the 
pre-selected  types  of  observations  (the  forward  problem),  well 
designed  experiments  provide  good  data  which  are  informative  to  the 
investigation.  Readers  interested  in  the  design  of  experiments  are 
refered  to  the  works  of  Box  et  al .  (1959,  1963  and  1967). 

Estimation  theory  plays  a  vital  role  in  making  progress  in 
physical  oceanography.  The  ocean  is  a  very  complicated 
environment.  The  forcing,  initial  conditions,  and  boundary 
conditions  are  uncertain.  The  exact  description  of  the  fluid  motion 
by  mathematical  equations  is  often  very  difficult,  and  even  when 
where  it  is  possible,  the  exact  solution  is  often  intractable. 

Thus,  in  the  theoretical  study  of  an  oceanic  phenomenon,  we  must 
resort  to  assumptions  and  approximations  (idealizations)  that  are 
reasonable  for  the  particular  study.  Different  assumptions  and 
approximations  result  in  different  models,  and  only  after 


experiments  are  conducted  and  estimations  performed,  can  we  then 
compare  models  for  the  confirmation,  rejection  or  revision  of 
hypotheses.  Therefore,  estimation,  which  utilizes  data  observed  in 
experiments,  provides  a  feed  back  loop  in  the  process  of 
understanding  the  ocean. 

This  chapter  considers  the  general  estimation  problem  The 
technique  and  results  of  estimation  specific  to  the  observations  of 
planetary  waves  in  this  study  are  presented  in  chapter  5.  The 
corresponding  forward  problem  has  been  studied  in  chapters  2  and  3. 
In  the  first  part  of  this  chapter,  estimation  methods  developed  from 
pure  stochastic  approaches  as  well  as  those  with  few  probabilistic 
considerations  are  reviewed  and  discussed.  Our  goal  is  to  relate 
and  unify  tnese  methods  by  showing  that  once  the  same  set  of 
information  and  assumptions  concerning  the  solution  and  experimental 
noise  is  consistently  and  analogously  adopted  by  each  individual 
rnetnod,  these  methods  give  the  same  solution.  In  showing  this,  a 
generalized  estimation  procedure  that  computes  this  "optimal" 
solution  common  to  all  the  methods  considered  is  also  established. 
The  implication  is  that  we  can  stop  worrying  about  these  different 
methods  and  just  apply  the  generalized  procedure  to  data,  since  the 
solution  is  independent  of  the  methods  themselves.  The  generalized 
procedure  is  the  minimization  of  the  now  familiar  function  of  a 
weighted  sum  of  products  of  residuals  from  both  the  experimental  and 
"a  priori"  data.  The  second  part  of  this  chapter  reviews  and 
discusses  some  widely  used  minimization  methods  for  computing  the 


solution.  The  last  part  considers  the  errors  in  the  solutions  and 
presents  some  overall  measures  of  goodness  of  a  model  based  on  its 
final  residuals.  Such  measures  of  goodness  are  needed  in  comparing 
model s. 


4.1  The  Genera]  Estimation  Problem 

The  models  for  many  physical  situations  can  be  expressed 
symbolically  as 

y  =  f(x,£) ,  (4.1) 

where  y  is  an  observable  vector  representing  the  signal  produced  by 
a  physical  event,  x  is  a  controllable  vector  of  design- parameters 
defining  the  experimental  conditions  for  observing  y,  £  is  a  vector 
of  physical  parameters,  where  the  value  of  £  is  not  of  our  choosing 
but  rather  characterizes  the  physical  event,  and  f  is  a  vector  of 
functions  (model  equations)  which  express  one's  theory  on  the 
relation  among  quantities;  f  is  a  vector  of  functionals  when  £ 
represents  continuous  functions  in  their  parametric  forms.  Let  us 
define  the  dimensions  of  y,  f,  x  and  £  to  be  mxl,  mxl,  rxl  and  nxl , 
respectively.  Note  that  all  the  vectors  are  column  vectors. 

The  study  of  a  forward  problem,  typically,  consists  of 
identifying  a  relevant  set  of  physical  parameters  and  deriving  an 
appropiate  set  of  model  equations.  The  idea  is  to  be  able  to  model 
the  signal  for  a  given  situation  described  by  x  and  £,  by 
incorporating  all  the  essential  features  of  the  true  process  into  f. 

The  corresponding  inverse  problem  is  the  estimation  of  £,  based 
on  data  obtained  in  an  experiment  of  controlled  x.  The  model 
equations  f  are  considered  to  be  known  from  the  study  of  the  forward 


problem.  In  the  presence  of  additive  random  noise  in  the 
ooservations,  which  is  always  the  case  in  practice,  the  experiment 
can  be  modeled  stochastically  as 


y*  =  f(x,£)  +  v.  (4.2) 

The  data  or  observations,  denoted  as  y*,  contain  the  signal,  but 
unfortunately,  are  contaminated  by  noise  v,  and  y*  and  v  are  both 
(m-dimensional )  vector  random  variables.  The  experimental  noise  v 
includes  both  the  measurement  noise  and  model  error.  Because  the 
data  is  imperfect,  only  approximate  solutions  or  estimates  are 
obtainable.  An  estimation  or  inversion  procedure  acting  on  the  data 
to  give  an  estimate  is  called  an  estimator.  In  general,  different 
estimates  may  or  may  not  be  computed  from  different  estimators, 
given  the  same  data  set.  However,  the  "optimal"  or  the  "best" 
estimate  £*,  that  is  the  unique  solution  for  £,  is  evaluated  from 
the  optimal  estimator  which  is  established  according  to  one's 
criteria  for  the  optimal  estimate.  Consequently,  the  quality  of  £*, 
besides  depending  on  the  quality  of  the  observations  and  the  model, 
depends  on  the  estimator  that  is  employed. 


* 


4.2  Establishing  Stochastic  Estimators 

Due  to  the  randomness  in  y*,  although  £  itself  may  or  may  not 
be  a  vector  random  variable  (a  random  p  corresponds  to  a  random 
process),  the  estimates  are  always  random  in  nature:  For  a  given 
estimator,  different  realizations  of  y*,  or  equivalently,  of  v, 
would  result  in  different  estimates.  In  fact,  the  statistical 
properties  of  the  estimates  depend  on  the  estimator  used  and  the 
statistical  properties  of  v. 

Before  establishing  the  estimator  for  computing  the  optimal 
estimate  £*,  one  must  do  the  following:  (1)  Select  a  desired  set  of 
statistical  criteria  for  the  optimal  estimate,  (2)  collect  all  the 
available  statistical  information  concerning  the  noise  v,  and  (3) 
collect  all  the  prior  information  concerning  the  physical  parameters 

E- 

4.2.1  Criteria  For  The  Optimal  Estimate 

A  reasonable  estimator  should  produce  estimates  which,  on  the 
average,  are  close  to  the  true  value  of  £.  There  are  two  types  of 
error  associated  with  £*:  the  bias  and  the  random  errors,  and  small 
bias  and  small  variance  are  generally  highly  desirable.  (Bias  is 
the  difference  between  the  expected  value  of  the  estimate  and  the 
true  solution. ) 

In  most  cases,  unbiased  estimators  are  hard  to  obtain,  and  even 


if  obtainable,  the  corresponding  estimates  are  usually  unstable  to 
noise,  meaning  that  small  errors  in  the  data  can  be  translated  into 
large  errors  in  the  estimate.  In  fact,  a  small  bias  must  often  be 
introduced,  intentionally,  for  uniqueness  and  for  reducing  the 
variance  of  the  solution  of  an  ill-conditioned  system  (Rust  and 
Burrus,  1972)  (here  "system”  means  system  of  equations  as  expressed 
in  equation  (4.2)).  Thus,  total  lack  of  bias  is  neither  essential 
nor  often  desirable,  because  unbiased  estimates  are  not  error-free 
and  are  sometimes  unstable. 

The  theoretically  attainable  lower  bound  of  variance  is  given  by 
the  Rao-Cramer  theorem  (see  Bard,  1974,  for  the  derivation). 

However,  practically,  the  estimator  associated  with  this  minimum 
variance  bound  (MVB )  can  only  be  established  for  a  few  simple 
systems  such  as  linear  systems.  The  MVB  estimators  in  the  case  of 
linear  systems  can  be  derived  easily  by  the  Gauss-Markov  theorem 
(Liebelt,  1967).  In  many  engineering  applications  of  estimation 
theory,  the  development  of  a  new  estimation  method  is  usually  not 
necessary  or  important,  because  many  of  the  existing  and  commonly 
used  estimators  can  generally  provide  reasonably  accurate  estimates, 
and  in  addition,  the  minimum- variance ,  unbiased  (i .e.  the  most 
ideal)  estimator  is  unattainable  in  most  cases,  anyway.  In  choosing 
a  common  method,  we  have  simply  accepted  the  criteria  for  the 
optimal  estimate  associated  with  the  method. 


4.2.2  Noise  Distribution 


Complete  statistical  knowledge  of  the  vector  random  variable  v, 
that  is  its  joint  probability  distribution  function  (pdf)  is  seldom 
possessed,  and  usually  only  a  little  information  concerning  v  is 
available,  for  example,  its  mean  (vector)  and  covariance  (matrix). 
However,  we  must  somehow  assign  to  v  an  adequate  pdf  because  most 
estimation  methods  demand  it.  If  only  the  mean  and  variance  are 
known,  a  rational  choice  is  the  (multivariate)  Gaussian  (or  normal) 
distribution,  for  reasons  stated  in  the  following  paragraphs: 

(1)  Simplicity:  a  Gaussian  distribution  is  parameterized  by  its 
mean  and  variance  only,  and  the  assumption  of  a  normal  pdf  for  v 
generally  leads  to  the  establishment  of  simple  estimation  procedures 

(2)  Under  some  mild  conditions,  if  v  is  generated  from  a 
summation  or  integration  of  many  random  variables,  whether  normal  or 
not,  v  tends  to  the  normal,  according  to  the  Central  Limit  theorem 
(a  proof  of  the  theorem  can  be  found  in  Drake,  1967). 

(4)  We  do  not  want  the  estimator  to  be  falsely  informed  by 
specifying  more  statistical  information  than  we  actually  know.  In 
information  theory,  Shannon  (1948)  has  derived  a  suitable  measure  of 
the  information  contained  in  a  pdf;  this  measure  is  called  the 
entropy  and  it  is  inversely  proportional  to  the  amount  of 
information.  Without  further  information  beyond  the  mean  and 
variance,  the  Gaussian  distribution  maximizes  the  entropy  (the  proof 
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can  be  found  in  Bard,  1974),  implying  that  the  amount  of  extraneous 
information  is  minimized. 

Henceforth,  v  is  assumed  to  be  normally  distributed  with  zero 
mean  and  a  known  nonsingular  symmetric  covariance  matrix  £v.  In 
many  cases,  the  true  Cy  may  not  be  exactly  known,  but  this  poses 
no  serious  problem  in  the  estimate.  In  general,  a  reasonalbe 
approximation  of  Cy  can  suffice,  because  most  estimators  are  not 
sensitive  to  small  variation  in  £v.  In  addition,  the  parameters 
can  always  be  reestimated  using  a  refined  £v  when  the  noise 
estimate  (i.e.  the  final  residuals)  generated  by  the  estimator 
signifies  that  the  original  specification  is  far  from  being 
correct.  Model  errors,  very  often,  have  nonzero  means,  and  the 
assumption  that  the  means  are  zero  will  result  in  the  generation  of 
bias  error  in  the  estimate.  However,  when  the  model  is  accurate, 
the  bias  will  be  small.  The  generation  of  bias  will  be  further 
discussed  in  Ch.  6,  Sec.  6.4. 

4.2.3  Prior  Information 

Prior  information,  if  available,  can  often  increase  the  accuracy 
of  the  estimate.  In  fact,  for  ill-conditioned  systems,  the  use  of 
prior  information,  which  is  equivalent  to  the  introduction  of  bias 
in  the  case  of  linear  systems,  must  be  insisted  upon  (Jackon,  1979  , 
Rust  ana  Burrus,  1972;  also  see  the  discussion  in  Ch.  6,  Sec.  6.4 


for  reasons  of  stability  and  uniqueness.  The  information  can  come 
from  previous  experiments  or  physical  intuition,  and  it  can 
generally  be  summarized  in  two  forms:  a  priori  probability 
distributions,  and  deterministic  equality  and  inequality  contraints 
for  £. 

A  scientist  usually  has  some  idea  of  the  true  value  of  p  before 
carrying  out  an  experiment.  For  instance,  he  may  know  that  the  true 
£  must  lie  in  a  region  around,  say,  £=£q  .  The  above  information 
can  often  be  expressed  by  inequality  contraints.  On  the  other  hand, 
if  one  is  willing,  the  same  information  can  be  summarized  in  an  a 
priori  pdf  P(£):  The  specification  of  the  a  priori  expectation  by 
£q  and  the  a  priori  covariance  matrix  according  to  the 
boundary  of  the  region  leads  naturally  to  the  assignment  of  an  a 
priori  Gaussian  distribution  for  £,  with  respect  to  information 
theory.  In  what  follows,  we  consider  only  estimation  with  a  priori 
probability  distribution. 


4.3  Statistical  Estimation  Methods 

Maximum  likilehood  (ML)  and  the  mode  of  the  posterior 
distribution  (MPD)  (when  statistical  prior  information  is  available) 
are  representati  ve,  common  estimation  methods.  One  important  reason 
for  their  typicality  is  that  many  other  common  methods  give  the  same 
estimate  when  all  the  random  variables  under  consideration  are 
normal.  Other  reasons  are  their  wide  range  of  utility,  simplicity 
in  applications  and  that  the  estimates  are  generally  easy  to  compute 

The  ML  estimate  (MLE)  is  the  value  of  £  that,  maximizes  the 
likelihood  function  obtained  by  substituting  the  realization  of  y* 
into  P(y*|£),  i.e.,  the  pdf  of  y*,  given  £.  The  reasoning  is  that 
the  MLE  is  associated  with  the  physical  event  which  is  most  likely 
to  produce  the  data  that  we  have  observed.  On  the  other  hand,  the 
MPD  estimate  (MPDE),  as  indicated  by  its  name,  is  just  the  value  of 
£  at  which  the  maximum  of  the  a  posteriori  distribution 
P({>|y*)  occurs;  P(£|y*)  is  the  pdf  which  we  must  assign  to  £  after 
the  experiment  was  conducted,  that  is  the  pdf  of  £,  given  y*. 
Clearly,  the  MPD  method  is  simply  an  extension  of  the  idea  of 
maximum  likelihood  to  accommodate  the  use  of  prior  information. 

Botn  the  MLE  and  MPDE  are  asymptotically  unbiased  (or  consistent) 
and  asymptotically  efficient  (Fisher,  1950),  that  is  the  estimates 
become  unbiased  and  reach  the  MVB  when  the  number  of  observations 
increases  to  infinity,  therefore,  we  would  expect  the  estimates  to 


have  small  Diases  and  small  variances  when  the  set  of  data  is  much 


larger  than  the  set  of  parameters.  Futhermore,  the  estimates  do  not 
depend  too  strongly  on  the  actual  shapes  of  the  distribution 
functions  P(y*if>)  or  P(£jy*),  and  the  tails  of  the  distributions 
have  no  effect  at  all  on  the  estimates. 

The  MPD  method  belongs  to  the  class  of  estimation  methods  which 
uses  Bayes  '  theorem.  Let  us  now  formulate  the  MPD  estimator.  From 
Bayes’  theorem,  we  have  that 


P(£|.y*)  =  P(y*jf))P(j3)/P(y*}. 


(4.3) 


But  since 


P(y*)  = 


P(y*|£)dj3 


(4.4) 


is  not  a  function  of  p,  and  together  with  the  assumption  of  normal 
distributions  such  that 


P(j\>  =  (2.r"y2  detf1/2(Cv)  e-l/2[)!*-f(x,E)]TC/1[i*-f()i.En 

(4.5) 


and 


P(£)  =  (2w)_n/2det"1/2(C  )  e'1/2(E.‘Eo)T-£ 


(4.6) 
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it  follows  that  the  maximum  of  P(£|y*)  is  identical  to  the  minimum 
of  the  function: 


s(£)  =  sd(£)  +  sp(£),  (4.7a) 

where 

sd(£)  *  l/2Ly*-f(x,£)]TC-1Ey*-f(x,£)]  (4.7b) 


and 

sp(£)  .  l/2(£-E0)TC‘1(E-E0)* 


(4.7c) 


The  function  s(£)  is  called  an  objective  function,  which  is  a 
measure  of  the  "lack  of  fit"  between  the  data  and  model  for  a  given 
value  of  £  (Bard,  1974).  We  can  interpret  sd  and  sp  as  the 
constraints  on  £  provided  by  the  data  and  prior  information, 
respectively.  Thus,  the  MPD  estimator  is  the  minimization  of  the 
objective  function  of  equation  (4.7),  and  the  location  of  the 
(least)  minimum  is  the  MPDE.  It  is  a  general  fact  that  almost  any 
estimation  method  can  be  reduced  to  the  minimization  of  an  objective 


function,  as  will  be  shown  below. 

A  few  comments  on  the  minimum  point  £*  of  equation  (4.7),  that 
is  the  MPDE,  are  listed  below: 

(1)  If  prior  information  is  not  available  so  that  C“*=0,  £* 
is  identical  to  the  MLE.  This  can  be  shown  by  observing  that  the 
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minimum  of  is  the  maximum  of  the  likelihood  function  of 
equation  (4.5). 

(2)  Even  when  C‘*/0,  £*  may  be  interpreted  as  the  MLE:  As 
pointed  out  by  Jackson  (1979),  the  a  priori  information  may  be 


incorporated  in  the  system  of  equations  by  treating  £q  as  the  (a 
priori)  data  for  p  with  covariance  matrix  C^.  In  this  way,  there 
are  n  more  equations  in  the  system  and  £*  is  where  the  maximum  of 


the  modified  likelihood  function  occurs,  assuming  v  and  £  are  not 


correlated. 


(3)  If  the  moael  equations  f  are  linear  in  £  (linear  system), 
and  v  and  £  are  uncorrelated,  £*  is  identical  to  the  1  inear  minimum 
variance  (Gauss-Markov )  estimate  (Liebelt,  1967). 

(4)  If  the  data  are  not  enough  to  constrain  £  by  themselves, 
that  is  the  system  is  underdetermined  anchor  ill-conditioned  such 


that  more  than  one  least  minimum  exists  when  minimizing  sd  alone, 
then  additional  constraints  provided  by  the  prior  information 
denoted  by  the  term  Sp  must  be  added  to  impose  uniqueness.  This 
is  always  the  case  when  inverting  functions,  for  £  is  effectively 
infinite  dimensional.  On  the  other  hand,  if  the  system  is 
well-conditioned,  which  may  be  the  case  when  the  data  outnumber  the 


physical  parameters,  then  the  addition  of  Sp  in  the  objective 
function  will  have  little  effect  on  £*. 
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4.3.1  Incorporation  Of  Different  Data  Types 

We  define  an  independent  data  set  as  a  subset  of  the  entire  set 
of  data,  produced  by  the  same  physical  event,  but  measured  with  a 
different  technique,  so  that  the  randomness  of  any  one  subset  is 
statistically  independent  of  the  other  subsets. 

Suppose  y*  is  a  joint  vector  of  k  independent  data  subsets  y.* 
acquired  in  the  experiment 

i 

! 

+  .V  (4.8) 

where  f^,  and  v^  are  respectively  the  vectors  of  model 
equations,  design  parameters  and  random  noise,  corresponding  to  the 
observation  of  y  *.  Since  v^  and  v^  are  uncorrelated,  the  data 
constraint  sd  of  the  objective  function  of  equation  (4.7)  for  the 
optimal  estimate  decomposes  into  a  sum  of  sub-constraints  such  that 

k 

sd  =  1/2  ^  [4-fi(xi,£)]TC:1C4-f.(x.,£)],  (4.9) 

i 

where  is  the  covariance  matrix  of  v...  has  two  important 
functions:  (1)  to  nondimensionalize  the  ith  set  of  data  and 
equations  so  that  data  sets  with  different  physical  units  can  be 
incorporated  together,  and  (2)  to  control  the  relative  effect  of  the 
ith  data  set  on  the  estimate  upon  its  reliability. 


4.3.2  Treatment  of  Erroneous  Design  Parameters 

Optimal  values  of  the  design  parameters  x=x^  are  preselected 
so  as  to  optimize  the  effectiveness  of  an  experiment  that  will  be 
performed.  However,  the  introduction  of  error  in  the  preselected 
value  of  x  can  seldomly  be  avoided  during  the  deployment.  For 
instance,  a  physical  oceanographer  may  want  to  deploy  a  mooring  at  a 
preselected  location,  but  the  imperfection  in  navigation  renders  the 
preselected  position  subject  to  error.  If  the  signal  In  data 
produced  by  the  error  in  x  is  smaller  than  the  noise  level,  the 
error  may  be  of  no  consequence;  otherwise,  minimizing  the  objective 
function  of  equation  (4.7)  will  produce  an  erroneous  result  which 
can  no  longer  be  an  optimal  estimate  of  £. 

This  problem  can  be  dealt  with  by  treating  the  preselected  value 
Xq  of  x  as  the  observation  of  the  true  value  of  x,  and  modifying 
tne  system  of  equations  (4.2)  to 


(4.10) 

where  w  is  the  error  in  Xg.  In  this  system,  the  true  value  of  x 
is  also  treated  as  a  vector  variable  to  be  estimated,  and  there  are 
additional  r  unknown  parameters  and  r  data  points.  Suppose  we  have 
an  idea  of  what  the  bounds  on  w  are,  so  that  we  can  characterize  w 


by  a  normal  distribution  with  a  covariance  matrix  C^.  This  leads 
to  the  minimization  of  the  following  modified  objective  function: 


s(x,p)  =  ^*-f.(x,£)]TC^1[y*-f(x,£)>^(x0-x)TC^1(x0-x) 

+  ^eo-E*  c^Eo-R) 


(4.11) 


We  refer  x^  and  as  the  erroneous  design  data  and  the  a  priori 
data,  respectively.  They  result  in  two  constraining  functions  which 
are  similar  in  form  to  those  given  by  the  experimental  data  y*. 
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4.4  Non-probabil  istic  Estimation  Methods 

There  are  estimation  methods  developed  originally  with  little  or 
no  consideration  of  statistics.  These  methods  do  not  consider 
optimal  statistical  criteria  for  the  estimate,  but  instead,  the 
optimal  criteria  are  selected  in  a  more  deterministic  manner  upon 
physical  intuition  or  sometimes  in  an  ad  hoc  manner.  However,  these 
methods  have  their  analogues  in  the  pure  stochastic  framework  once 
adequate  probability  distributions  are  attached.  For  examples,  the 
primitive  method  of  "weighted  least  squares"  for  estimating  a 
handful  of  numbers  is  related  to  the  ML  method,  and  the  recent 
"variational  method"  of  Provost  (1983)  and  the  classical  "inverse 
metnoas"  of  Backus  and  Gilbert  (1967,  1968  and  1970),  Wiggins 
( 1 y72 ) ,  Jackson  (1972),  Parker  (1977)  and  Wunsch  (1978)  for 
estimating  continuous  functions  are  related  to  the  MPD  method. 

4.4.1  The  Variational  Method 

Provost's  variational  method  translates  the  problem  of 
estimating  continuous  functions  to  a  problem  in  the  calculus  of 
variations.  In  the  simplest  description,  the  estimation  problem 
becomes  the  determination  of  £  that  minimizes  a  nonnegative 
"smoothing"  functional  of  the  unknown  function  represented 
parametrically  by  £,  and  subject  to  the  data  constraint 
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Ly*-f(£)]'w[y*-f(£)]  =  q.  (4.12)  v 

In  the  above,  q  is  an  expected  (or  presumed)  positive  value  of  a 
measure  of  the  total  misfit  between  data  and  model  prediction,  and  W 
is  a  positive-definite,  diagonal  weighting  matrix  for 
non  dimensional  izing  and  scaling  data  and  equations  having  different 
physical  units  and  different  order  of  magnitudes.  Scaling  factors 
associated  with  the  degree  of  reliability  on  each  datum  can  also  be 
included  in  W.  Let's  assume  that  the  unknown  function  is  a  time 
signal  in  this  discussion.  The  smoothing  functional  can  be  the 
integral  of  the  square  curvatures  or  square  slopes,  or  some  other 
desired  nonnegative  measures  of  smoothness  of  the  time  signal,  and 
it  can  De  expressed  parametrically  as  £TS  £  with  the  matrix  S 
oeing  positive  definite.  The  corresponding  objective  function(al) 
to  be  minimized  is  l(£,a)  with 

l(£,a)+aq  =  a[y*-f (x,£)]TW  [y*-f (x,£)]  ♦  £TS  £,  (4.13) 


where  a  is  the  Lagrange  multiplier  to  be  found. 

In  the  variational  method,  a  criterion  for  the  optimal  estimate 
is  the  satisfaction  of  the  data  constraint,  but  since  there  will  be 
so  many  solutions  satisfying  this  constraint  due  to  the 
underdetermined  nature  of  this  system,  another  criterion  must  be 
brought  in  to  ensure  uniqueness,  and  it  is  smoothness.  Clearly,  the 


method  chooses  among  all  the  solutions  of  equation  (4.12)  the 
smoothest  one  to  be  the  optimal  estimate,  with  smoothness  defined  by 
the  selected  smoothing  functional  . 

With  £q=0,  the  similarity  between  equations  (4.13)  and  (4.7) 
is  evident,  and  in  fact,  they  have  the  same  minimum-point  (i.e.,  the 
two  methods  have  identical  solution)  when  S  and  W  are  set 
proportional  to  the  inverse  covariance  matrices  C^1  and 
C-/  of  £  and  v,  respectively.  Under  such  choice  of  S  and  W,  if 
£  represents  the  Fourier  amplitudes  of  the  time  signal  to  be 
estimated,  then  the  diagonal  elements  of  S'1  and  IT1  represent 
the  normalized  power  spectral  density  functions  of  the  signal  and 
noise,  respectively.  Furthermore,  a  is  analogous  to  the  signal  to 
noise  ratio  and  the  deterministic  criterion  of  smoothness  is 
analogous  to  the  a  priori  information  of  a  low-pass  signal  described 
statistically  by  the  spectrum  denoted  by  S'1. 

In  practice,  one  does  not  compute  a  and  £  simultaneously  through 
minimization,  but  instead,  they  are  often  determined  by  an  iterative 
technique:  A  guess  value  for  a  is  used  so  that  £  are  the  only 
variables  during  minimization,  and  after  the  corresponding  solution 
for  £  is  evaluated,  one  then  computes  the  corresponding  q  from 
equation  (4.12),  and  if  the  computed  q  is  acceptablely  close  to  the 
expected  value,  the  optimal  estimate  is  successfully  found, 
otherwise,  the  procedure  is  repeated  as  many  times  as  needed  with 
different  but  progressively  better  guess  values  for  a.  A  similar 
iterative  estimation  procedure  is  also  commonly  exercised  in 


stochastic  methods  because  Cy  and  are  generally  estimates 
themselves,  therefore,  their  values  must  be  adjusted  and  the 
minimization  procedure  must  be  repeated  if  the  final  residuals  do 
not  agree  with  the  presumed  covariances. 


4.4.2  The  Inverse  Methods 


Backus  and  Gilbert  originally  developed  a  general  formalism  for 
solving  the  linear  inverse  problem,  which  was  later  cast  into  simple 
linear  algebra  by  Wiggins,  Jackson,  Parker  and  Wunsch  in 
applications  to  geophysical  and  oceanographic  problems.  Such 
formalism  is,  Dy  now,  known  simply  as  "linear  inverse  methods".  The 
general  linear  inverse  problem  can  be  cast  into  the  parametric  form 

y*  =  L  R  +  1  (4.14) 

by  replacing  f(£)  with  F  £  in  equation  (4.2),  where  F  is  a  mxn 
matrix  representati on  of  the  linear  differential  operator  associated 
with  the  forward  model  and  £  is  a  parametric  representation  of  the 
continuous  function  to  be  estimated.  Again,  since  £  is  effectively 
infinite  dimensional  while  the  number  of  observations  is  limited, 
the  system  is  under  determined,  i.e.,  n>>m.  The  system  is  generally 
ill-conditioned  as  well  ,  so  that  there  are  infinite  number  of 
unstable  solutions  (i.e.  solutions  with  large  error  variance)  that 
satisfy  equation  (4.14)  identically  with  v  set  to  zero.  One  thus 


faces  the  problem  of  nonuniqueness  compounded  with  instability. 

Before  going  any  further,  let  us  first  transform  equation  (4.14) 
to 


/  =  F'£‘  +  v',  (4.15) 

where  y ' =W  17 2y*,  £'=S1/2£,  v'=W1/2v,  and  F'=W1/2F  S_1/2. 

1/2 

The  scaling  by  W  is  necessary  because  some  obsevations  may  be 

1/2 

less  reliable.  The  scaling  by  S  is  also  necessary,  because 
without  this  scaling,  the  large  weighting  coefficients  in  F  would 
tend  to  put  large  amplitudes  to  the  associated  parameters  in  an 
unaerdetermined  system.  Both  S  and  U  are  symmetric  positive 
definite  matrices.  Formally,  the  solution  p/  of  (4.15)  can  be 
expressed  as  a  weighted  sum  of  normalized  orthogonal  vectors  v. 
belonging  to  a  complete  set  such  that 


£' 


I 

j=l 


aj  -J 


(4.16) 


The  a-'s  are  the  unknown  coefficients  which  we  hope  to  determine 

J 

from  the  data  and  from  some  criteria  in  the  inverse  problem. 
Choosing  the  right  set  of  v.'s  is  crucial  to  the  success  of  the 

J 

i nvers i on . 


To  deal  with  nonuniqueness  and  instability,  linear  inverse 
methods  proceed  with  a  spectral  decomposi tion  of  F ' ,  that  is  the 


singular  value  decomposition  (SVD)  of  F'  (Lanczos,  1961).  The  SVD 
gi  ves 

F'  =  U  A  VT  ,  (4.17) 


where  A  is  an  mxm  diagonal  matrix  with  nonnegative  elements,  and  U 

anc  V  are  mxm  and  nxm  matrices,  respectively.  The  jth  diagonal 

element  (at  the  jth  row  and  jth  column)  of  A  is  the  jth  singular 

o 

value  x.  or  the  square  root  of  the  jth  eigenvalue  x.  of 
J  J 

either  one  the  following  eigenvalue-eigenvector  problems: 

rr\  =  jsl  ,2, . . .  ,m,  (4.18a) 


or 


rllrl 

F-  F-  -j 


xj-j: 


.  ,n , 


(4.18b) 


witn  Xj by  convention.  The  solution  for  the  eigenvectors 
v,  of  equation  (4.18b)  is  the  choice  of  the  set  of  basis  vectors 

J 

for  jj'  in  equation  (4.16).  The  jth  columns  of  U  and  V  are  the 
eigenvectors  u  -  and  v.,  respectively.  Notice  that  x  .=0  for 

J  J  J 

j>m,  and  the  corresponding  null-space  eigenvectors  v. 's  with  j>m, 

even  though  are  constructible ,  they  are  not  included  in  V  in  the 

decomposition  of  F1 .  It  is  because  they  are  not  resolvable  or 

constrained  by  the  data:  Any  combination  of  the  null- space 

eigenvectors  is  a  solution  to  the  homogeneous  equation  f'£'=0,  anc 


they  are  the  reason  for  nonuniqueness.  At  this  stage,  a  good 


strategy  1s  t0  ignore  the  null-space  completely  and  accept  the 

unique  particular  solution  £p'  as  the  appro,  imate  solution  for 

£*  .  By  suostituting  (4.16)  and  (4.17)  into  (4.15)  with  v'  set  to 

zero,  and  using  the  equalities  (U  and 

J  J  J 

(U  A  VT)Tui=x.vi  and  the  orthonormal ity  of  the 
eigenvectors,  we  obtain  3j=Xj^(Ujy' ) ,  and  hence, 

m 

E'p- 

j  =  l 

When  there  are  less  than  m  independent  equations  in  the  system,  the 
number  of  nonzero  singular  values  (the  rank  of  the  system)  is 
actually  less  than  m,  and  this  corresponds  to  a  larger  null- space. 

Unfortunately,  £^‘  is  not  a  stable  solution.  As  can  be  seen 
in  equation  (4.19),  the  effect  of  the  noise  in  y‘  is  magnified  by 
the  vanishingly  small  singular  values.  These  appear  because  of  the 
ill-conditioning  of  the  system,  i.e.  noise  in  the  model  F'  and 
almost  redundant  information  in  the  data.  The  useful lness  of  the 
SVD  is  now  obvious:  it  provides  a  meaningful  set  of  basis  vectors 
for  £' ,  in  which  the  stable  and  unstable  components  (vectors)  are 
well  distinguished  by  the  sizes  of  their  singular  values.  Thus,  a 
stable  approximate  solution  £' *  can  be  obtained  by  discarding  or 
down-weighting  the  unstable  components.  A  down- weighting  technique 
is  to  modify  equation  (4.19)  to 


where  a  is  analogous  to  the  Lagrange  multiplier  of  the  variational 
method,  representing  the  signal  to  noise  ratio,  and  its  value  is 
progressively  adjusted  until  the  residuals  are  acceptable  and  the 
solution  is  stable.  This  is  identical  to  filtering  the  particular 
solution  by  a  low-pass  filter  because  the  unstable  components  are 
usually  more  oscillatory;  indeed,  a  smoothed  version  of  the 
parti afar  solution  is  obtained.  This  smoothed  solution  is  stable 
to  noise  and  it  is  a  good  approximation  when  the  true  solution  is 
also  smooth. 

Replacing  £* *  with  S1//2£*  in  equation  (4.20),  where  £*  is  the 
estimate  to  the  original  parameters  £,  and  recasting  the  equation 
back  into  matrix  form,  one  obtains 

£*  =  (FTW  F+SrVW  (4.21) 

with  a  set  to  unity.  Note  that  one  can  always  make  a=l  by  rescaling 
W  and  S.  The  stochastic  analogue  of  the  inverse  methods  is 
disclosed  by  realizing  that  the  linear  inverse  solution  shown  in 
equation  (4.21)  is  actually  identical  to  the  MPDE  evaluated  by 
minimizing  s(£)  of  equation  (4.7)  with  pQ=0  and  f (£_)=£  £, 
providing  that  the  inverse  covariance  matrix  of  noise  and  the 


inverse  a  priori  covariance  matrix  of  p  are  exactly  equal  to  W  and 

S,  respecti vely.  Since  the  MPDE  has  the  statistical  property  of 

efficiency  (minimum  variance)  in  the  case  of  1  inear  systems,  the 

linear  inverse  solution  becomes  the  linear  minimum  variance  estimate 

when  W,  S  and  a  are  identical  to  C"1,  C'1  and  unity, 

—  —  —v  —  £ 

respectively  (Cornuelle,  1983). 

It  was  mentioned  earlier  that  a  priori  information  in  the  form 
of  a  pdf  is  required  to  provide  uniqueness  and/or  stability  in  the 
stochastic  methods  when  the  system  is  underdetermined  anchor 
ill-conditioned.  There  are  no  exceptions  in  either  the  variational 
or  inverse  methods  except  that  the  prior  information  comes  in  an 
equivalent  but  non-probabil  istic  form,  which  is  the  statement  that 
tne  continuous  function  to  be  estimated  is  smooth. 

We  have  shown  the  equivalence  of  the  MPD,  variational  and 
inverse  methods.  Therefore,  someone  interested  only  in  the  final 
solution  and  computional  efficiency  would  no  doubt  formulate  the 
estimation  procedure  within  the  context  of  optimizing  objective 
functions.  However,  many  geophysicists  prefer  the  less  efficient 
but  more  powerful  spectral  decomposition  technique.  Unlike  the 
objective  function  approaches,  in  which  the  information  of 
smoothness  is  incorporated  right  at  the  beginning  of  and  during  the 
optimization  process,  the  spectral  decomposition  approach  does  not 


use  this  information  until  the  whole  spectrum  of  solutions  is 


oDtained.  From  there,  the  resolution  of  each  parameter  and  the 
distribution  of  independent  information  are  simultaneously  provided 
by  the  spectral  decomposition:  the  jth  column  of  the 
solution- resolution  matrix  V  indicates  how  well  a  delta 
function  located  at  the  jth  column  of  £  can  be  resolved,  and  the  jth 
column  of  the  data- resolution  matrix  U  describes  the 
distribution  of  the  jth  independent  piece  of  information  in  the 
data.  The  dravrfjack  with  spectral  expansion  techniques  is  that  they 
are  not  applicable  to  systems  that  are  not  linear  or  cannot  be 
1 inearized. 


4,5  Methods  for  Minimization 


In  order  to  focus  our  attention  on  the  minimization  of  the 
objective  function  s(£)  of  equation  (4.7),  we  revise  £  to  include 
both  physical  parameters  and  design  parameters,  and  to  include 
both  experimental  data  and  erroneous  design  data,  when  necessary. 
We  would  like  to  emphasize  that  the  minimization  of  s(£)  is  a 
generalized  estimation  procedure  of  many  estimation  methods.  Some 
widely  used  numerical  techniques  of  minimization  for  getting  the 
optimal  estimate  are  reviewed  and  discussed  in  this  section. 

4.5.1  Linear  System 


The  location  of  tne  unique  minimum  of  the  objective  function 
s(£)  of  (4.7)  for  the  1  inear  system  in  (4.14)  can  be  evaluated, 
analytically,  as 


£*  = 


(4.22a) 


where 

H  =  (FTc;1F  ♦  O’1) 


(4.22b) 


is  tne  Hessian  (the  matrix  of  second  derivatives)  of  s(£).  It  can 
oe  evaluated  prior  to  the  finding  of  £*  in  a  linear  system  because 
it  is  not  a  function  of  £*.  The  solution  £*  exists  providing  that 


the  inverse  of  H  exists.  The  most  complicated  step  in  solving  for 
£*  is,  therefore,  to  invert  H.  Gaussian  elimation  and  some  of  its 
variations  such  as  the  LU  and  LlJ  decompositions  which  are  more 
convenient  for  numerical  implementations  are  generally  used  to 
perform  the  task{ Oahlquist  and  Bjorck,  1969). 

On  the  other  hand,  the  problem  can  also  be  solved  by  using  the 
more  powerful  although  less  efficient  SVD  as  discussed  earlier,  so 
that  resolution  and  information  distribution  can  also  be  analysed. 

In  order  to  use  the  SVD,  the  eigenvalue-ei genvector  problems  of 
(4.18a)  and  (4.18b)  must  be  attacked.  This  causes  loss  of 
efficiency  because  finding  eigenvalues  is  a  time  consuming  task. 

The  numerically  stable  QR  algorithm  for  finding  eigenvalues  and  the 
inverse  iterative  methods  for  evaluating  eigenvectors  are 
recommended  (Acton,  1970). 

4.5.2  Nonlinear  System 

Numerous  methods  for  minimization  have  been  developed  in  recent 
years,  but  there  is  no  single  scheme  that  works  for  all  problems.  A 
method  may  work  well  for  one  type  of  objective  function  but  fail  for 
another  type.  However,  most  of  the  methods  are  iterative  in  nature, 
requiring  an  external  initial  guess  tl  .'or  the  minimum-point  £*, 
and  then  generating  an  Internal  sequence  of  points  at  £*£1-  with 
1  =  2,3,...,  progressively ,  which  hopefully  converges  to  £*.  An 
iteration  is  the  process  of  generating  a  new  point  in  the  sequence. 


All  iterative  methods  are  based  on  the  fundamental  reasoning 
described  oelow. 

At  the  ith  iteration,  s(j>)  near  ^  may  be  evaluated  as 


s(&i+!R)=si+ii!R+{  1/2)  j«p |3  ), 


(4.23) 


where  6£  is  a  small  vector  displacement  from  £..,  s.*s(p^),  and 
£i=9.^Ri^  3,1(1  are  the  gradient  vector  and  Hessian  of 

s(j>)  evaluated  at  p  ,  respecti vely.  Suppose^  is  in  the 
quadratic  region  surrounding  £*,  so  that  terms  of  order  $£  ^  are 
negligiole  and  5(£.1+«£)  can  be  expressed  as 


2(Ri+«R)  =  Ri  +  Hi  «R- 


(4.24) 


Since  £=0  at  £*,  the  step  (vector  displacement)  that  reaches  £*  is 


1 R  =  -ili  £i 


(4.25) 


The  Newton-Raphson  { N-R )  method  adopts  the  above  scheme 
explicitly,  by  setting  the  ith  step  iE=Ri+i"Ri  exactly  equal  to 
-Hi^Si*  The  N_R  method  works  well  for  weakly  non-linear 
systems,  and  in  fact  it  works  perfectly  in  a  linear  system  by 
requiring  only  one  iteration.  Unfortunately,  it  also  fails  to  work 
in  many  cases  due  to  two  major  weaknesses.  First,  the  method  is 


mathematically  unstable,  that  is  it  does  not  guarantee  convergence 
to  a  minimum,  because  an  N-R  step  may  not  be  an  acceptable  step.  An 
acceptable  step  is  a  "down-hill"  step  such  that  s{£1-+1)<s(£^); 
covergence  can  only  be  guaranteed  if  all  the  steps  are  acceptable. 

A  down- hill  step  is  ensured  if  it  is  taken  along  a  down- hill 
direction  d^  such  that 

dj  =  -Giai ,  (4.26) 


where  G^  is  an  arbitrary  but  positive  definite  nxn  matrix. 

Realizing  that  the  ith  step  direction  of  the  N-R  method  is  the  one 
given  in  equation  (4.25)  with  H".1  replacing  G  ,  and  since  Hi 
can  be  nonpositive  definite  when  £^  is  not  inside  the  quadratic 
region,  stepping  up-hill  is  highly  possible  along  a  N-R  direction. 
The  second  major  weakness  is  that,  at  each  iteration,  the  method 
requires  the  evaluations  of  besides  g^-,  and  in  addition,  it 
also  requires  the  inversion  of  H  .  The  analytical  expression  of  H 
as  a  matrix  function  of  £  is  quite  often  very  difficult  to  derive, 
hence  the  evaluations  of  H  at  g^-'s  place  a  heavy  burden  on  the 
user.  This  cannot  be  too  pleasing  when  approximately  n  /2 
complicated  function  evaluations  are  needed  at  every  iteration,  not 
to  mention  the  heavy  computational  burden  of  inverting  large 
matri  ces . 


In  view  of  the  defects  of  the  N-R  method,  stable  and  more 
efficient  methods  have  been  sought  by  many  mathematicians.  As  a 
consequence,  iterative  descent  gradient  methods  (gradient  methods 
for  short)  were  developed.  These  methods  abate  the  burdens  on  both 
the  user  and  computer  by  requiring  only  the  evaluations  of  ^-'s 
but  not  H^'s  and  H"-1 ' s .  More  attractively,  gradient  methods 
are  stable  in  general  cases.  Hence,  the  two  major  weaknesses  of  the 
N-R  method  disappear  in  gradient  methods. 

At  each  iteration  of  all  the  gradient  methods,  a  down- hill 
direction  is  selected  at  the  current  point  and  then  an  acceptable 
step  is  taken.  This  is  the  reason  for  their  stability.  The  ith 
step  direction  is  evaluated  by  equation  (4.26)  and  d^  is  always  a 
down-hill  direction  because  the  positive  definiteness  of  is 
ensured  by  the  methods.  Different  gradient  methods  choose  the  step 
directions  (or  G^'s)  differently  but  a  similarity  of  all  is  that 
second-derivative  information  is  estimated  and  incorporated  in  G^ 
at  each  step,  which  gradually  evolves  to  become  the  inverse  Hessian 
at  £*  so  that  equation  (4.26)  also  evolves  to  become  equation 
(4,25).  As  a  result  of  this,  g*  is  located.  Gradient  methods 
oasically  fall  into  two  categories:  (1)  those  that  require  all  the 
gj  's  to  be  the  minimum-points  along  d^'s,  for  example,  the 
method  of  Fletcher  and  Powell  (1963),  and  (2)  those  that  take 
acceptaole  steps  but  not  necessarily  reaching  the  minimum-points 
along  d^ 's  in  all  the  steps,  for  example,  the  Marquardt's  method 
(1963).  The  trade-off  is  that  the  former  method  requires  more 


function  evaluations  per  iteration  but  less  iterations  while  the 
latter  methods  require  less  function  evaluations  per  iteration  but 
more  iterations.  However,  the  total  number  of  function  evaluations, 
which  determines  the  efficiency  of  a  method,  is  usually  the  same 
order  of  magnitude  for  the  two  different  approaches. 

Fletcher  and  Powell's  method  (1963)  is  used  in  our  study.  Its 
use  is  solely  a  matter  of  preference,  and  we  do  not  claim  that  it  is 
the  best  method  for  the  investigation  since  we  have  not  tested  other 
methods.  However,  we  have  found  its  performance  to  be  more  than 
satisfactory.  Although  the  method  requires,  at  each  iteration,  to 
step  to  the  minimum-point  along  the  selected  direction,  it  is  still 
quite  efficient  because  few  iterations  are  needed.  It  can  be  shown 
that  if  j^.  is  within  the  quadratic  region  of  a  minimum,  the  method 
tnen  only  requires  at  most  n  more  iterations  to  converge  to  the 
minimum,  where  n  is  the  number  of  unknown  parameters  of  s(jd).  The 
method  takes  very  small  steps  at  the  beginning  of  the  minimization 
process  but  follows  with  rapid  descent  after  H(£*)-1  has  been 
closely  approximated  by  . 
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4.6  Error  Of  The  Estimate 

An  estimate  has  no  meaning  by  itself  since  it  should  not  be 
trusted  for  the  interpretation  of  the  true  physical  situation 
without  the  knowledge  of  its  error.  To  investigate  whether  an 
estimate  is  well  or  ill  determined,  one  can  either  employ 
nonstatistical  response  surface  techniques  (Bard,  1974)  or, 
similarly,  tne  statistical  analyses  of  variance  (Jenkins  and  Watts, 
1909,  Bard,  1974) . 

In  the  response  surface  technique,  we  say  that  there  is  no 
reason  to  prefer  the  minimum  at  £*  as  the  solution  over  any  other 
value  of  £  for  which 

s(£)-s(£*)  ~  1/2  (£-£*)TH(£*) (£-£*)  <  e,  (4.27) 

where  e  is  an  arbitrary  small  constant  and  ~  is  replaced  by  =  when 
the  system  is  linear,  so  that  the  larger  (smaller)  the  diagonal 
elements  of  H(£*)  (H(£*)_1)  are,  the  better  the  corresponding 
parameters  are  estimated. 

On  the  other  hand,  statistically,  an  approximation  of  the 
logarithm  of  the  posterior  distribution  can  be  expressed  as 

1  ogLP ( £  jy* )  J  oc  - 1/ 2  (£-£*)TH(E*  )(£-£*) .  (4.28) 

This  corresponds  to  approximating  the  posterior  distribution  with  a 


normal  distribution ,  and  the  approximation  is  good  near  g*  if  the 
objective  function  is  symmetric  at  the  minimum.  The  tails  of  the 
distribution  are  of  no  concern  in  the  error  analyses.  Assuming  that 
£*  is  quite  close  to  the  expectation,  it  then  follows  that  an 
approximation  of  the  covariance  matrix  of  the  error  of  the  estimate 
is 


Thus,  the  diagonal  elements  of  H ( g* ) “ ^  are  approximately  the 
variances  of  the  estimates  of  the  parameters. 

It  was  shown  that  whether  considering  statistics  or  not, 

H(£*)"i  is  the  important  measure  of  error  of  £*.  We  would  like  to 
mention  that  a  problem  in  design  is  to  pre-arrange  the  design 
parameters  so  that  the  diagonal  elements  of  H(g*)_1  are  minimized 
for  an  expected  range  of  possible  values  of  g*.  This  design  problem 
is  easier  to  tackle  if  the  system  is  linear  since  in  this  case  H 
does  not  depend  on  £. 

Since  H(g*)  is  not  a  diagonal  matrix  in  general,  the  errors  of 
different  parameters  can  be  correlated.  However,  it  is  of  interest 
in  many  aspects  of  error  analyses,  for  example,  statistical 
inference,  to  look  at  uncorrelated  errors.  As  a  result,  a  linear 


transformati  on 
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£"  =  QT£ 


(4.30) 


of  the  original  parameters  £  is  often  exercised  so  as  to  bring  about 
uncorrelated  errors  of  the  transformed  parameters;  Q  is  an  nxn 
matrix.  In  other  words,  uncorrelated  errors  of  linear  combinations 
of  the  original  parameters  are  analysed.  These  orthogonal 
combinations  can  be  found  by  a  SVD  of  H(£*)  or  H(£*)~*.  Let  us 
consider  the  decomposition  of  H<£*)  such  that 


H(£*)  =  Q  D  QT, 


(4.31) 


where  D  is  a  nxn  nonnegative  definite  diagonal  matrix  consisting  of 
the  nonnegative  singular  values.  The  substitution  of  equation 
(4.26)  in  equation  (4.24)  with  |>"*QT£  and  £"*=QT£*  gives 


logLP(£"  y * )  j  oc  - 1/ 2  (£"-£"*) -£"*), 


(4.32) 


where  £"*  is  the  estimate  of  £".  It  is  seen  that  D  is  the 
covariance  matrix  of  the  error  of  £"*,  and  the  errors  of  these 
transformed  parameters,  which  are  linear  combinations  of  the 
original  parameters,  are  not  correlated  because  D  is  diagonal. 

In  a  nonlinear  system,  since  the  posterior  distribution  may  not 
be  uni  modal ,  many  initial  guesses  are  required  in  the  minimization 
procedure  inorder  to  expose  the  least  minimum  or  to  see  if  all  of 
them  converge  to  the  same  £*.  If  more  than  one  qlobal  minimum  is 
found,  the  estimation  problem  is  nonunique. 


i. 


w 


4.7  Goodness  Of  A  Model 


Even  more  important  than  judging  the  reliabilty  of  £*  is 
judging  the  reliability  of  a  model.  The  goodness  of  a  model  can  be 
assessed  by  analysising  its  final  residuals.  A  model  can  never  be 
proved  correct  in  principle,  but  it  can  be  proved  incorrect  or 
inconsistent  or  inferior  to  other  models.  Some  uniform  measures  of 
goodness  based  on  the  final  residuals  can  be  evaluated  for  different 
out  plausible  models,  which  can  then  be  compared  to  discriminate 
between  models  and  various  hypotheses. 

If  a  model  is  accurate  and  parameters  are  well  determined,  the 
residuals  will  reflect  the  experimental  random  noise.  In  fact, 
residuals  are  oiased  estimates  of  noise:  they  should  be  smaller  than 
the  actual  random  error  on  the  average  (Bard,  1974).  Some  of  the 
most  common  tests  on  residuals  in  time  series  are  Chi-square 
goodness-of-fit,  runs  and  correlation  tests  (Bendat  and  Piers ol  , 
1971),  which  are  used  to  confirm  a  model  by  verifing  noise 
statistics  such  as  normality,  stationarity  and  lack  of  correlation. 
However,  tnese  tests  are  not  applicable  when  only  a  few  realizations 
of  the  same  random  variable  are  made,  as  is  usually  the  case  in 
expensive  oceanographic  experiments,  for  example  CTD  surveys. 
Fortunately,  in  model  discrimination ,  there  is  less  interest  in 
knowing  hew  well  the  residuals  of  the  best  model  resemble  the  noise 
properties  than  in  knowing  how  well  the  data  are  resolved  by  the 
best  model  as  compared  to  the  other  models;  keeping  in  mind  that 


some  of  the  noise  properties  are  assumptions  anyway. 

In  what  follows,  we  present  some  unsophisticated,  yet  very 
useful,  measures  of  goodness  of  model,  which  are  often  sufficient  to 
serve  the  purpose  of  model  discrimination. 

The  simplest  of  all  measures  is  the  weighted  sum  of  products  of 
the  final  residuals,  that  is 

R  =  c  Jc^e,  (4.33) 

where  R  is  a  Chi-square  distributed  random  variable  with  m  degrees 
of  freedom,  e  is  the  final  residual  vector,  and  the  adjusting  factor 
c  is  the  total  number  of  experimental,  a  priori,  and  erroneous 
design  data  divided  by  the  same  number  less  the  total  number  of 
unknown  parameters.  If  a  priori  data  are  available  and  used  in  the 
oojective  function  of  equation  (4.7),  then  c^n+r/fm+n+rj-tn+r) 
=m+rr-r/m.  If  a  priori  data  are  not  used,  then  c=m+r/(m+r)-(n+r) 
=m+r/m-n,  where  r  is  the  number  of  erroneous  design  data  or 
parameters.  The  factor  c  is  needed  to  adjust  the  inverse  covariance 
matrix  of  noise  to  equal  that  of  the  final  residuals  due  to  the  bias 
(Bara,  1974).  A  significance  level  can  be  selected  for  rejecting 
models  on  the  two  edges  of  the  distribution. 

In  general,  the  smaller  the  misfit  between  the  data  and  the 
model  ,  the  better  the  model  and  the  resolution  in  the  solution  (or 
parameter)  space  are.  However,  care  must  be  taken  when  the  misfit 
is  extremely  small  ,  because  we  may  have  an  unstable  system  instead 
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of  a  perfect  model.  Backus  and  Gilbert  (1970)  have  shown  that 
traae-off  between  resolution  and  the  statistical  reliability  of  the 
estimate  exists  when  noise  is  present.  Moreover,  for  an 
ill-conditioned  system,  the  variance  of  the  estimate  increases 
without  bound  as  resolution  is  pushed  beyond  the  limit  imposed  by 
tiie  data.  Thus,  a  model  is  acceptable  if  and  only  if  both  the 
variance  and  R,  which  is  a  measure  of  misfit  and  hence  of  resolution 
also,  are  acceptable. 

To  illustrate  the  trade-off  between  resolution  and  reliability, 
consider  the  linear  system  (4.15).  An  estimate  may  be  constructed 
using  (4.19),  where  the  basis  vectors  vi  are  weighted  and  then 
summed  to  give  the  estimate.  Since  the  weighting  on  v  -  is  the 

J 

product  of  the  inverse  singular  value  x".1  and  the  projection  of 

J 

tne  observations  y‘  onto  the  corresponding  eigenvector  u-  in  the 

J 

data  space,  the  small  x.'s  can  translate  the  experimental  noise 

J 

into  large  estimation  errors.  Clearly  then,  the  reliability  of  the 
estimate  can  only  be  improved  by  degrading  the  resolution,  that  is 
discarding  or  down-weighting  the  v.'s  that  have  small  x.. 

J  J 

There  are  two  other  measures  which  are  often  used  to  judge  the 
success  of  a  model  in  predicting  (interpolating)  data.  They  are  the 
correlation  coefficient  between  observed  and  predicted  signal 


(4.34) 


and  the  amount  of  signal  energy  resolved  by  the  prediction 
fT(x*,^)C_1f 

E  =  -  x  100  percent.  (4.35) 


The  larger  C  and  E  are,  the  better  the  model  fits  the  data,  but 
again,  these  two  measurements  can  be  misleading  in  the  case  of 
in  stab  il  ity. 

The  similarities  in  shape  and  amplitude  between  the  observed 

signal  and  the  model  prediction  are  measured  by  C  and  E, 

respectively.  In  general,  C  and  E  are  independent,  but  for  a  least 

1/2 

squares  minimization,  1Q0C  equals  Ev  for  the  total  set  of  data 
points.  However,  for  individual  subsets  of  data,  C  and  E  remain 
useful  separate  pieces  of  information. 


CHAPTER  5 


ESTIMATION  OF  WAVE  PARAMETERS  AND  WAVE  DYNAMICS  (1): 
METHOD  AND  RESULTS 


5.1  The  Estimator 

Our  parameter- estimati on  problem  can  be  phrased  as  the  inversion 
of  the  sound-speed  perturbations  $c  based  on  the  data,  and 
contrained  by  the  dynamics  of  narrow-band  planetary  waves.  A 
consequence  of  the  addition  of  the  dynamical  constraint  on  6C  is  the 
modification  of  the  system  to  be  inverted  from  highly 
underdetermined  to  highly  over determined.  For  a  small  number  of 
waves,  the  system  can  be  well-conditioned  as  well.  It  was  the 
expectation  of  a  small  number  of  waves  and  of  a  well -condi tioned 
system  that  led  us  to  use  the  MLE  estimator  instead  of  the  MPD 
estimator.  It  was  learnt  in  the  estimation  process  that  as  the 
number  of  waves  W  increases,  the  condition  of  the  system 
deteriorated.  However,  this  has  no  effect  on  our  investigation, 
because  the  optimal  wave  fit,  corresponding  to  W=3,  was  unique  and 
wel 1-determined. 

With  reference  to  the  discussions  in  Sec.  4.3,  the  MLE  estimator 

can  be  formulated  as  the  minimization  of  an  objective  function  (i.e. 

likelihood  function).  Treating  the  130  modal-ampl  i tude  data 

(a?;  j=l . 130),  the  7  time  series  of  sound-speed 

J 

perturbations  ( «c?.  =«c°(t=3k  days);  k=0,...,26  and 


j=l,...,7)  and  the  58  time  records  of  travel -time  perturbations 


Utjk=6tj(t=9k  days);  k=0,...,8  and  j  =  l . 58)  as  3 

indepenaent  data  subsets  (refer  to  Table  3.2  for  the  sources  of 
data),  and  further  treating  the  uncertainty  in  the  nominal 
horizontal  positions,  ax,  and  the  unknown  horizontal  displacements 
( ax^axf^k  days);  k=Q,...,8)  of  the  acoustic  moorings  as  errors 


in  the  design  parameters,  the  objective  function  can  be  cast  as  a 
sum  of  5  constraining  functions  of  similar  forms  of  weighted  sum  of 
square  of  residuals.  Because  there  were  9  acoustic  moorings,  ax  and 
ax.  are  18-dimensional  vectors,  and  we  denote  their  jth  components 


by  axj  and  ax ^ ,  respectively.  The  objective  function  can  thus 
be  expressed  as 


s(£,ax,ax0,...,ax8)  =  s^E^s^R^s^.ax.a^ . axg) 

+sax(^)  +  sax(^0 . ^8)  (5*la) 


sa  =  1/2 


-2  r  O  m»  >  -i2 

'’a.jCaraj<*)]  • 


(5.1b) 


s«  ■  1/2 


/  C  O 

7  7 


-2  r  o  m  ,  n2 
°6C>jk[5Cjk-4':jklE»  - 


(5.1c) 


K 
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«sq 


m 


°«t  ,jk  Cdtjk  _dtjk  (5*ld) 


representing  the  constraints  imposed  by  each  of  the  data  subsets  on 
a  wave -propagation  model  (Model  0,  1  or  2)  that  is  characterized  by 
a  corresponding  set  of  unknown  parameters  £  as  described  in  Sec. 
3.4,  where  a™,  6Cjk  and  6tjk  are  defined  in  (3.19), 

(3.20)  and  (3.24),  respectively.  Furthermore, 


-2  2 
a  .  AX . 
AX.J  J 


( 5 .1  e ) 


18  8 


s  =1/2 

<5X 


-2  2 
°fix  ,jk5x jk 


(5. If) 


j=l  k=0 


represent  the  constraints  imposed  by  the  erroneous  design  data  on 
the  incorrect  horizontal  mooring  positions.  In  writing  down  (5.1b) 
to  (5. If),  we  have  assumed  uncorrelated  experimental  noise  and 
desi  gn -parameter  errors,  with  the  variances  of  a°,  <sc  °k , 

At  jk  *  ax  j  and  <5Xjk  being  denoted  by  ojj,  o^.^, 

<4,jk'  and  4,j’  r«pect1v.ly.  If  a  priori 
information  on  £  were  incorporated  in  the  estimator,  (5.1a)  would 
have  an  additional  constraining  function,  again  of  a  similar 


•  -  -  «  •  •*  »  r  • 


152 

form.  The  minimum  point  would  then  be  the  MPD  estimate  and  the 
variance  of  the  estimate  should  be  reduced. 

Although  a  priori  information  was  not  incorporated  explicitly  in 
the  estimator,  it  was  utilized  in  many  related  occasions.  An 
implicit  usage  was  in  the  filtering  and  reductionof  the  data  (Sec. 

3.3).  On  the  other  hand,  the  optimization  orminimizati on  of  (5.1) 
was  facilitated  by  reasonable  initial  guesses  of  £  that  are 
consistent  with  the  prior  information,  for  example,  the  guessed 
wavelengths  are  of  order  100  km  and  the  guessed  wave  amplitudes  are 
of  order  meters  per  second. 


5.2  Assignment  Of  Noise  Variances 

In  almost  any  parameter- estimati on  problem,  the  variances  of 
measurement  noise  are  generally  fairly  accurately  known.  On  the 
other  hand,  one  usually  has  less  idea  or  no  idea  at  all  of  what  the 
variances  of  the  model  noise  might  be,  especially  when  the 
estimation  problem  corresponds  to  model  identification.  However, 
this  inexact  knowledge  of  noise  statistics  does  not  in  general 
introduce  any  major  obstacle  in  solving  estimation  problems.  There 
are  two  reasons  for  this:  first,  most  estimators  are  not  sensitive 
to  slight  variations  in  noise  variances,  thus  as  long  as  the 
assigned  variances  are  within  reasonable  ranges  of  the  true 
variances,  the  estimate  will  not  be  greatly  affected.  Second,  all 
estimators  also  generate  an  estimate  of  noise,  besides  an  estimate 
of  the  parameters,  so  that  one  can  rely  on  the  noise  estimates 
themselves,  that  is  the  final  residuals,  for  refinement  of  the 
assigned  variances  in  an  iterative  estimation  process  when 
necessary.  The  assignment  of  the  noise  variances  in  (5.1)  is 
described  below.  The  assigned  values  were  later  found  to  be 
consistent  wi th  the  final  residuals,  i.e.  the  final  residuals  are 
not  consistently  larger  or  smaller  than  the  assigned  standard 
deviations . 

By  analysing  numerous  sound-speed  profiles  acquired  by  Piips 
(1967)  between  Bermuda  and  Eleuthera,  Mooers  (1974)  found  strong 
evidence  for  the  existence  of  a  first  baroclinic  semidiurnal  tide 
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with  an  amplitude  of  0.7  m/s  in  «c  at  550  m  depth.  Furthermore, 

nonlinear  and  higher-mode  perturbations  are  neglected  in  all  3 

wave- propagation  models  {Table  3.3).  These  neglected  higher-order 

perturbations  combined  with  the  internal  tide  are  probably  the  major 

contributors  to  model  error.  While  the  errors  in  the 

modal-ampl  itude  data  are  most  sensitive  to  the  internal  tide  due  to 

the  lack  of  temporal  filtering,  the  errors  in  the  filtered  «c  time 

records  are  most  sensitive  to  higher-mode  fluctuations.  In 

addition,  the  ac  time  records  are  also  subject  to  errors  caused  by 

vertical  mooring  motion.  We  guess  that  the  's  and  a  .'s 

oc ,jk  a,j 

should  be  roughly  1  n^s,  and  thus  have  set  a,  .=  5,,.  .,,=1  m/s  for 
all  j  and  k . 

Considering  the  measurement  noise  and  internal  waves  and  tides 

alone,  Cornuelle  et  al .  (1985)  have  estimated  the  daily  mean 

variance  of  travel-time  noise  to  be  3.6  ms2.  In  order  to  include 

the  errors  introduced  by  the  neglected  higher-mode  perturbations  and 

current  effects,  and  the  assumption  of  travel-time  linearity  in  our 

models,  we  have  added  5C  ms  to  their  estimated  variance,  that 
2  2 

is  we  have  made  afit  ^=3.6+25  ms  .  We  note  that  some  of  the 
travel  times  were  missing  or  not  resolvable  from  the  58  ray  paths 
used  on  some  particular  days,  (especially,  during  the  later  period,) 
and  in  these  cases  we  set  the  corresponding  variances  to  infinity. 

The  available  tracking  data  indicate  that  the  horizontal  mooring 
displacements  were  of  order  200  m.  Therefore,  we  have  set 
°6X  jk=2o°  m  and  20  m  ^or  the  11,1  tracked  and  tracked  displacements. 


respectively.  The  20  m  standard  deviation  represents  the 
measurement  error  expected  from  the  navigation  systems.  We  have 
further  set  a  -=500  m  for  all  j,  which  is  a  reasonable  value  as 
indicated  by  the  observed  travel-time  perturbations. 
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5.3  Resul ts 

The  iterative  descent  gradient  method  of  Fletcher  and  Powell 
(1963)  was  used  for  the  wave  fits,  that  is  the  optimization  of  (5.1) 
witn  different  wave-propagation  models  and  numbers  of  waves.  In 
each  minimization,  after  accepting  an  initial  guess  of  the  unknown 
parameters  u=(£,ax, sx^ , . . .  ,«Xq)  ,  the  method  then  proceeds  to 
locate  a  minimum  by  estimating,  progressively,  the  inverse  Hessian 
matrix  H*"^  (i.e.  the  inverse  of  the  matrix  containing  the  second 
derivatives)  of  s(u)  at  the  minimum  point  u*  (Sec.  4.5.2).  Thus  an 
estimate  of  u,  u*.  and  an  estimate  of  the  error- covariance  matrix  of 
u*,  H* ,  are  generated,  simultaneously. 

For  each  of  the  three  models,  one  to  five  waves  were  fitted  to 
the  data.  At  least  four  different  initial  guesses  of  £  for  a  given 
model  (Model  0,1  of  2)  and  number  of  waves  (W=l,2,3,4  or  5)  were 
used  in  the  optimizations  to  explore  the  least  minimum  (i.e.  the 
solution)  and  to  investigate  nonuniqueness.  All  the  initial  guesses 
of  ax  and  ax^  were  null  vectors.  While  the  wave  fits  with  W<3  are 
unique,  those  with  W>3  are  not.  In  each  fitting  with  W<3,  most  of 
the  initial  guesses  converged  to  the  same  stationary  point  where  the 
least  minimum  occurs,  and  although  a  few  initial  guesses  converged 
to  different  stationary  points,  the  corresponding  minima  are 
considerably  larger.  For  each  of  the  wave  fits  with  W>3,  different 
initial  guesses  resulted  in  different  minima  of  approximately  the 
same  size,  hence  a  unique  least  minimum  could  not  be  identified. 
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The  change  from  uniqueness  to  nonuniqueness  as  W  increases  is  a 
demonstration  of  the  trade-off  between  resolution  and  stability.  As 
W  increases,  so  do  the  magnitudes  of  the  wavenumber  estimates. 

Thus,  finer- scale  structures  of  the  perturbations  are  intended  to  be 
resolved  with  a  larger  W,  but  because  of  the  inadequacy  of  the  data 
in  resolving  them,  the  system  for  sc  is  rendered  underdetermined. 

InCornuelle's  (1983)  time-independent  inversions,  no  dynamical 
constraint  is  imposed  on  the  solution  for  sc,  and  in  order  to  ensure 
uniqueness,  he  incorporates  an  a  priori  covariance  of  sc  that  is 
assumed  to  be  horizontally  Gaussian  with  a  decay  scale  of  100  km  in 
the  estimator.  This  is  the  same  as  requiring  the  solution  to  be 
smooth  in  space.  Cornuelle  points  out  that  the  solution  for  sc  is 
not  sensitive  to  small  variations  in  the  assumed  spatial  decay 
scale.  We  have  encountered  a  similar  situation  in  our 
time -dependent  inversions.  An  interesting  fact  is  that  although  the 
wave-parameter  estimates  are  nonunique  in  the  cases  of  W=4  and  5, 
the  solution  for  the  corresponding  sc  is  unique.  That  is,  the 
estimated  fields  of  sc,  and  the  amounts  of  resolved  data  variance 
associated  with  the  different  stationary  points  are  practically  the 
same.  Indeed,  the  constraints  imposed  by  the  wave  dynamics  are 
analogous  to  the  criterion  of  smoothness,  the  diiferent  stationary 
points  are  analogous  to  the  variations  of  the  decay  scales  in  space 
and  time,  and  a  time-dependent  inversion  is  not  sensitive  small 
variations  of  Doth  decay  scales. 
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In  order  to  assess  and  compare  the  different  wave  fits  so  that 
the  optimal  model  and  W  may  be  identified,  a  simple  measure  of 
goodness,  that  is  the  weighted  sum  of  squares  of  final  residuals  R, 
defined  in  (4.27),  was  computed  for  each  of  the  wave  fits.  When  the 
estimate  is  close  t.o  the  true  value,  the  probability  distribution  of 
the  final  residuals  is  approximately  »  "■'mal  (because  the  noise 
distrioution  is  normal )  and  the  covariances  of  the  residuals  and  the 
noise  are  approximately  proportional  to  each  other  (Ch.  4,  Sec. 

4.3).  Thus,  R  is  approximately  a  Chi-square  distributed  random 
variable  with  ntb41  degrees  of  freedom  where  m  is  the  number  of 
data,  and  the  0.01  significance  level  of  the  random  variable  is  at 
R~94Q.  In  Fig.  5.1,  we  plotted  R  versus  W  for  each  model.  It  is 
seen  that  the  performance  of  Models  1  and  2  is  much  better  than  that 
of  Model  0.  While  none  of  the  wave  fits  of  Model  0  passes  the  0.01 
significance  test,  the  fits  with  W=3,4  and  5  of  Model  1  and  2  are  at 
and  oeyond  the  0.01  significance  level.  Althougn  Model  1  and  2 
perform  equally  well  ,  the  estima  ad  growth  rates  of  the  wave 
amplitudes  in  Model  2  do  not  differ  significantly  from  zero  and,  in 
fact,  their  signs  are  amoiguous  because  their  rms  errors  are  larger 
than  the  estimated  growth  rates  themselves.  The  lack  of  ability  to 
determine  the  growth  rates  is  not  surprising,  however,  because  (1) 
resonant  interactions  should  be  rare  occurences  since  the  forced 
waves  can  grow  if  and  only  if  they  satisfy  the  dispersion 
relationship,  and  (2)  even  if  resonance  actually  ocurrs,  the  time 
scale  of  the  growth,  in  weak-interaction  theory,  is  much  longer  than 
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Model  2 
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Figure  5.1.  Misfits  of  the  3  wave- propagation  models  on  the  data  as 
functions  of  the  nunfcer  of  waves  fitted.  The  measure  of  misfit  Is 
R,  which  a  weighted  sum  of  products  of  final  residuals.  The 
weighting  factors  are  the  reciprocals  of  noise  variances. 


a  wave  period  and  ,  hence,  data  measured  within  a  wave  period  cannot 
be  adequate  for  observing  such  phenomena.  The  reason  for  fitting 
Model  2  to  the  data  is  to  see  if  there  are  any  surprises  that  are 
inconsistent  with  the  theory.  For  Model  1,  the  data  variance 
resolved  increases  by  20  percent  as  W  changes  from  2  to  3  and 
increases  by  as  little  as  5  percent  as  W  further  changes  from  3  to  4 
or  5.  Also,  we  must  keep  in  mind  that  W=4  and  5  correspond  to 
unstable  wave  fits.  Thus,  an  overall  judgement  cl  early  favours 
Model  1  to  be  the  optimal  wave-propagation  model  in  which  a  mean 
flow  is  present  and  W=3  to  be  the  optimal  number  of  propagating 
first  baroclinic  waves. 

To  make  further  assessments,  we  computed  for  each  wave  fit  the 
correlation  coefficient  between  the  ith  independent  data  subset 
and  the  fit,  and  the  amount  of  variance  in  the  ith  data  subset 
resolved  by  the  fit,  ,  using  (4.34)  and  (4.35),  where  i = 1 , 2  and 
3  aenote  the  data  subsets  of  modal  amplitudes,  sc  time  records  and 
st  time  records,  respectively.  For  Model  1,  that  is  the  optimal 
model,  Cj  's  and  1  s  versus  W  are  plotted  in  Figs.  5.2a  and  b, 
respecti  vely.  At  W=3,  i.e.,  the  optimum,  we  obtain  's  of  0.8, 

0.9  and  0.98  and  E^  's  of  78  ,  82  and  96  percent  with  i=l,2  and  3, 
respectively.  There  is  no  inconsistency  although  C-j  and  E3  are 
considerably  larger,  because  a  large  portion  of  the  variance  in  the 
st  time  records  is  resolved  by  the  determination  of  the 
mooring-position  errors  alone.  The  consistently  high  correlations 
and  resolutions  are  a  strong  evidence  of  the  existence  of  three 
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first  Daroclinic  planetary  waves  in  the  tomographic  region  during 
the  experimental  period.  The  optimal  values  of  the  parameters  for 
the  waves  and  mean  flow,  and  their  standard  deviations  (square  roots 
of  the  diagonal  elements  of  H**^)  are  shown  in  Table  5.1.  In  the 
taole,  the  phase  and  group  velocities,  the  Doppler  shifts  and  the 
shifted  periods  themselves,  as  well  as  the  directions  and  lengths  of 
the  waves  are  presented.  Although  the  mean  flew  is  very  weak,  it 
must  be  taken  into  consideration,  since  it  speeds  up  the  phase 
propagation  considerably  by  generating  Doppler  effects;  it  is  thus 
vital  to  the  success  of  the  wave  fit. 


Table  5.1 

The  Optimal  Estimate  Of  The  Wave  Parameters 


(a)  Independent  Wave  Parameters;  the  numbers  behind  the  *  signs  are 
the  standard  deviations 


wave  i.d.  6C  amplitude  wavenumber  vector  phase  constant 
i.d.  no.  Ai  (ny/s)  (1/km)  1  j  (1/km)  ri  (rad.) 


m 


2.28*0.12 
1 .73*0.09 


rim: 


IKQill:] 


..  _  .0203*. 0010  L._ 

-.0066±.0005  -.0198*.0007  1.51*0.12 

-.011 9*. 0005  -.0034*. 0008  -0.06*0.11 


mode  no.  mean-current  modal-ampl  itude  vector  zero- reference  sc 

m  u0  (cm/s)  v0  (cn^s)  b0  (nv's) 


M 


-0.76*0.13 


Bill 


0.39*0.09 


■1.46*0.21 


(b)  Dependent  Wave  Parameters 


wave  wave  direction 

i.d.  no.  length  of  phase 

i  (km)  (degree) 


wave  Doppler  shift 
period  period 
(days)  (days) 


The  seven  observed  time  records  of  «c  are  plotted  in  Fig.  5.3 to 

5.9  together  with  the  optimal  fits.  It  is  seen  that  the 
observations  and  the  optimal  interpolations  compare  favorably. 
Furthermore,  some  secondary  perturbation  with  a  period  of  about  20 
days  superimposed  on  the  primary  perturbations  created  by  the  3 
linear  dispersive  waves  are  found  consistently  in  all  the  time 
records.  The  secondary  oscillations  were  most  profound  at  the 
mooring  site  E2,  i.e.,  at  (x  ,y  )={  150 .7  ,13.6)  km.  Because  the 
frequency  is  below  the  inertial  frequency,  this  oscillation  cannot 
be  due  to  internal  waves;  we  speculate  that  the  secondary 
perturbations  were  caused  by  the  forced  waves  that  oscillate  at 
frequencies  equal  to  the  sum  of  the  frequencies  of  the  interacting 
oarotropic  and/or  baroclinic  waves. 

To  demonstrate  that  the  observed  pattern  of  the  fairly 
complicated  system  can  indeed  be  reconstructed  accurately  by  the 
gradual  evolution  of  three  waves,  we  show  a  time  sequence  of  the 
estimated  and  surveyed  sound-speed  maps  at  a  depth  of  700  m  in  Fig. 

5.10  to  5.16.  The  average  sound-speed  at  that  depth  is  1506  n^s. 
The  estimated  perturbed  sound  speed  on  yearday  66  ,  83  ,  102,  120  and 
137  are  contoured  in  Fig.  5.10,  5.12,  5.13,  5.14  and  5.16,  and  the 
observed  sound  speed  from  the  first  and  second  CTD  serveys  were 
contoured  in  Fig.  5.11  and  5.15,  respectively.  It  is  seen  that  the 
waves  generated  a  trough  that  was  moving  slowly  to  the  west  and  then 
produced  a  front  that  was  advancing  from  the  northeast  during  the 
later  period. 
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Figure  5.3.  Comparison  of  the  optimal  wave  fit  ( _ )  with  the 

sound-speed  perturbation  time  series  (  +  +  +)  observed  from  the 
temperature  sensor  located  at  x»{  145 .7  ,145.0, -.62  6  5)  km. 
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Figure  5.4.  Comparison  of  the  optimal  wave  fit  ( _ )  with  the 

sound- speed  perturbation  time  series  (♦  +  +)  observed  from  the 
temperature  sensor  located  at  x»(145 .7 ,145.0, -.8350)  km. 


SOUND  SPEED  PERTURBATION  (m/< 


060 


00 


:■>  00  =>0  00  30  00 


100  00  no  00  120  00  1 3 0  00  no  00 


TIME  i  DAY)  at  (150  7.  13  6.-  4489}  KM 


Figure  5.5.  Comparison  of  the  optimal  wave  fit  ( _ )  with  the 

sound-speed  perturbation  time  series  (♦  +  +)  observed  from  the 
temperature  sensor  located  at  x*{  150.7,  13. 6, -.4489)  km. 
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Figure  5.7.  Company  of  the  optimal  wave  fit  (  )  with  tl 

sound-speed  perturbation  time  series  (♦  ♦  ♦)  observed  from  the 
temperature  sensor  located  at  x-(150.7,  13.6, -.6937)  km 
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Figure  5.8.  Comparison  of  the  optimal  wave  fit  ( _ )  with  the 


sound-speed  perturbation  time  series  (  +  +  +)  observed  from  the 
temperature  sensor  located  at  x*(  18.9,  91 .5, -.8678)  km. 
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Figure  5.9.  Comparison  of  the  optimal  wave  fit  ( _ )  with  the 

sound-speed  perturbation  time  series  (♦  +  +)  observed  from  the 
temperature  sensor  located  at  x»{  18.9,  91.5,-1.1176)  km. 


Figure  5.10.  Sound-speed  map  at  a  depth  of  700  m  of  the  optimally 
estimated  wave  field  in  the  experimental  square  on  yearday  66. 
Contour  Interval  is  1  nv^s  and  the  reference  sound- speed  at  this 
depth  is  1506  s. 
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Figure  5.11.  Sound- speed  at  a  depth  of  700  m  in  the  experimental 
square,  mapped  by  the  1st  CTD  survey.  The  survey  began  on  yearday 
66  and  ended  on  yearday  83.  Contour  Interval  Is  1  m/s  and  the 
reference  sound-speed  at  this  depth  Is  1506  m/s. 


Figure  5.12.  Sound-speed  map  at  a  depth  of  700  m  of  the  optimally 
estimated  wave  field  In  the  experimental  square  on  yearday  83. 
Contour  Interval  Is  1  nv^s  and  the  reference  sound-speed  at  this 
depth  Is  1506  iVs. 


Figure  5.13.  Sound-speed  map  at  a  depth  of  700  m  of  the  optimally 
estimated  wave  field  in  the  experimental  square  on  yearday  102. 
Contour  Interval  Is  1  iVs  and  the  reference  sound-speed  at  this 
depth  is  1506  iV  s. 


Figure  5.14.  Sound-speed  map  at  a  depth  of  700  m  of  the  optimally 
estimated  wave  field  in  the  experimental  square  on  yearday  120. 
Contour  interval  is  1  m/s  and  the  reference  sound-speed  at  this 
depth  is  1506  ny's. 


Figure  5.15.  Sound-speed  at  a  depth  of  700  m  in  the  experimental 
square,  mapped  by  the  2nd  CTD  survey.  The  survey  began  on  yearday 
120  and  ended  on  yearday  137.  Contour  interval  is  1  nVs  and  the 
reference  sound-speed  at  this  depth  is  1506  n/s. 
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Figure  5.16.  Sound-speed  map  at  a  depth  of  700  m  of  the  optimally 
estimated  wave  field  in  the  experimental  square  on  yearday  137. 
Contour  interval  is  1  n^s  and  the  reference  sound-speed  at  this 
depth  is  1506  nVs. 


The  covariance  matrix  H*  of  the  wave-parameter  estimate 
g*  (i  .e.  the  corresponding  block  in  the  inverse  Hessian  matrix  of 
the  objective  function  evaluated  at  the  minimum  point)  gives 
indications  on  which  wave  parameters,  or  linear  combinations  of  wave 
parameters,  are  well  determined,  and  which  are  poorly  determined.  A 
simple  measure  of  the  quality  of  the  estimate  is  given  by  the 
diagonal  elements  of  H*~^,  which  are  the  variances  of  the 
errors  in  the  estimate;  the  standard  deviations  are  listed  in  Table 
5.1a.  However,  the  presence  of  nonzero  off-diagonal  elements 
implies  that  the  errors  are  correlated,  and  a  full  description  of 
the  error  structure  must  take  all  the  elements  of  the  matrix  into 
account.  As  discussed  in  Sec.  4.6,  a  full  description  may  be 

obtained  by  finding  the  eigenvalue  decomposition  of  H*_*  such 

-IT  ^ 

that  H*^  =U  D  U  ,  where  D  is  the  diagonal  matrix  containing 

the  eigenvalues  and  U  is  the  matrix  containing  the  eigenvectors  in 

its  columns,  so  that  new  variables  defined  by  £*  =UTp  and 

representing  a  set  of  linear  combinations  of  the  wave  parameters 

would  have  uncorrelated  errors  in  their  estimate  £'*=UT£*.  The 

error  variance  of  £'*  is  D.  We  have  performed  the  decomposition  and 

found  the  set  of  linear  combinations  of  wave  parameters.  We  have 

found  that  all  the  17  linear  combinations  were  well  determined.  The 

difference  between  the  variances  of  the  best  and  the  worst 

determined  linear  combinations  is  small.  The  17  linear  combinations 

will  not  be  listed  since  they  serve  no  further  purpose  in  this 


investigation. 


Finally,  it  is  desirable  to  obtain  an  error  estimate  of  the 
estimated  sound-speed  perturbation  6C*=6Cm(jl  )  due  to  the  error 
A£*  of  the  estimated  wave  parameters  £*.  Through  a  linearization  of 
the  wave  and  mean- flow  induced  sound- speed  perturbation  scm(£) 
about  £*,  the  error  of  sc*  can  be  approximated  by 


:*  ~  ]T  a£*. 

3£ 


(5.2) 


It  then  follows  that  the  variance  of  ac*  can  be  written  as 


<aac*2>  -  £ ascm (£* )  j  H*-l[  3<5cm(£*)  ^ 


(5.3) 


where  is  the  covariance  matrix  of  A£*.  In  Fig.  5.17,  5.18 
and  5.19,  we  show  the  contour  plots  of  the  standard  deviation  of  sc* 
(i.e.  the  square  root  of  (5.3))  at  a  depth  of  700  m  on  yeardays  83, 
102  and  120,  respecti vely.  Because  the  densities  of  the  ray  paths 
and  the  CTD  stations  were  much  higher  in  the  middle  of  the  area,  the 
errors  are  smaller  there.  Furthermore,  since  there  was  an 
enviromental  mooring  E2  on  the  southern  boundary,  the  errors  near 
this  boundary  is  smaller  than  those  near  the  northern  boundary  where 
no  enviromental  mooring  was  deployed  (see  Fig.  3.1).  The  constraint 
imposed  by  the  wave  dynamics  had  introduced  a  high  correlation 


Figure  5.17.  Error  map,  showing  contours  of  the  standard  deviation 
at  a  depth  of  700  m  of  the  optimally  estimated  sound- speed 
perturbations  in  the  wave  field  in  the  experimental  square  on 
yearday  83.  Contour  interval  is  0.05  iVs. 


Ffgure  5.19.  Error  map,  showing  contours  of  the  standard  deviation 
at  a  depth  of  700  m  of  the  optimally  estimated  sound-speed 
perturbations  In  the  wave  field  In  the  experimental  square  on 
yearday  120.  Contour  Interval  Is  0.05  m/s. 


CHAPTER  6 


ESTIMATION  OF  WAVE  PARAMETERS  AND  WAVE  DYNAMICS  (2): 

DISCUSSION  AND  CONCLUSIONS 

6.1  Summary  Of  The  Wave  Fits 

Using  estimation  theory  and  optimization  techniques,  we  have 
studied  the  existence  and  dynamics  of  dispersive  baroclinic 
planetary  waves.  The  estimations  were  based  on  the  profile,  point 
and  integral  measurements  of  sound-speed  (or  temperature) 
perturbations  obtained  in  the  1981  Ocean  Tomography  Experiment. 
Maximum  Likelihood  estimators  that  correspond  to  least- square 
fitting  were  employed.  Many  other  commonly  used  estimation  or 
inversion  methods  are  analogous  to  the  Maximum  Likelihood  method  and 
tiie  technique  of  1  east- squares,  that  is  the  generalized  estimation 
or  inversion  procedure  is  the  minimization  of  an  objective  function 
of  a  weighted  sum  of  products  of  residuals  as  discussed  in  Ch.  4. 

A  range  of  one  to  five  waves  that  propagate  according  to  three 
plausible  models  were  fitted  to  the  data.  The  properties  of  the 
different  wave  fits  were  then  compared  so  that  the  most  consistent 
propagation  model  could  be  identified  and  the  optimal  number  of 
existing  waves  could  be  estimated.  The  data  set  used  in  the 
fittings  was  derived  from  the  measurements  through  filtering  and 
data  reduction. 


The  'best'  fit  can  unambi gously  be  identified  to  correspond  to 
three  waves  that  evolved  under  the  presence  of  a  mean  flow.  The 
evidence  of  the  existence  of  the  waves  is  supported  by  the  high 
correlation  between  the  fit  and  the  observed  signal  (>0.88)  and  the 
large  amount  of  signal  energy  resolved  (^78  percent),  in  each  of  the 
three  independent  data  subsets.  Furthermore,  the  high  correlations 
and  resolution  cannot  be  a  result  of  ill-conditioning  in  the  system 
of  model  equations  because  the  optimal  solution  for  the  wave 
parameters  is  unique  and  well-determined.  As  indicated  in  Table 
5.1,  the  rms  errors  are  only  about  10  percent  of  the  estimate. 


6.2  C orrnien ts  On  The  Wave  Dynamics 

Westward  phase  propagation  is  known  to  be  typical  of  mesoscale 
perturbations  at  mid- latitudes  from  previous  experiments. 
Consistently,  as  indicated  in  Table  5.1,  the  phases  of  the  observed 
waves  were  all  propagating  westward.  The  corresponding 
group- velocity  vectors  have  westward  directions  also,  implying  that 
the  waves  were  generated  somewhere  to  the  east  of  the  experimental 
region,  therefore,  the  possibility  that  they  were  radiated  by  the 
intense  Gulf  Stream  can  be  ruled  out.  The  three  baroclinic  waves  do 
not  form  a  resonant  triad  since  the  sum  or  difference  of  the  phases 
of  two  of  the  waves  does  not  equal  the  phase  of  the  other  wave. 
However,  the  propagation  of  resonant  baroclinic  waves  is  still 
possible  because  they  could  be  generated  by  interacting  barotropic 
waves.  The  fastest  oscillation  that  could  be  forced  by  the  observed 
baroclinic  waves  would  result  from  the  interaction  between  the  1st 
and  the  3rd  waves  and  would  have  a  period  of  ( 1/117+ 1/121)_1~60 
days.  But,  since  the  secondary  perturbation  which  we  have  observed 
from  the  moored  time  records  of  temperature  has  a  period  of  20  days 
(see  Fig.  5.3  to  5.9),  it  must  be  due  to  the  interaction  of 
barotropic  waves  that  have  much  higher  frequency  cutoffs. 

In  the  absence  of  a  mean  flow,  the  short-period  cutoff  of 
first-mode  baroclinic  waves  is  approximately  160  days,  e.g.  (2.52), 
so  that  the  waves  cannot  account  for  the  high  frequency  content 
(i.e.  periods  of  117  and  121  days)  of  the  data.  This  is  well 
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demonstrated  by  the  wave  fits  of  Model  0.  Although  the  mean 
current,  as  estimated,  is  very  weak  (approximately  2  cm/s),  it 
strongly  alters  the  space  and  time  behavior  of  the  wave-induced 
perturbations  by  producing  large  changes  in  the  wave  periods  or 
frequencies  (the  Doppler  effects  have  reduced  the  wave  periods  of 
the  3  waves  by  202,  164  and  77  days,  respectively).  Thus,  the  weak 
mean  current  has  played  an  important  role  on  the  wave  propagation  in 
the  region. 

The  approximate  solution  for  linear  dispersive  planetary  waves 
is  obtained  by  neglecting  the  nonlinear  and  Tinear-coupl  ing  terms  in 
the  horizontal- structure  equations  (2.43)  for  mesoscale  motions. 

Let  us  first  comment  on  the  linearization  and  then  discuss  the 
linear  coupling  in  the  context  of  instability  theory. 

Qualitatively,  the  1  inearization  is  valid  when  the  ratio  of  the 
particle  to  phase  speed  of  the  waves  is  small  when  compared  to 
unity.  As  the  ratio  decreases,  so  do  the  nonlinear  effects. 
Therefore,  by  shortening  the  wave  periods  and  hence  increasing  the 
ihase  velocities,  a  westward  mean  current  can  weaken  the  nonlinear 
interactions  between  the  dispersive  waves,  thus  making  the  linear 
approximation  better.  The  magnitudes  of  the  phase  and  particle 
velocities  of  the  observed  dispersive  primary  waves  were  computed 
and  the  results  are  presented  in  Table  6.1a.  Furthermore,  the 
magnitudes  of  the  phase  velocities  of  the  waves,  computed  as  if  the 
mean  current  were  absent,  are  also  presented  in  the  same  table.  It 
is  seen  that  if  the  weak  mean  current  were  absent,  the  validity  of 


the  linearization  for  the  wave  motions  would  be  harder  to  justify. 
The  pressure  amplitudes  of  the  secondary  waves  forced  by  the 
observed  primary  waves,  computed  using  (2.66),  and  the  pressure 
amplitudes  of  the  primary  waves  themselves  are  given  in  Table  6.1b. 
The  ratios  of  the  rms  pressure  amplitudes  of  the  secondary  to  the 
primary  waves  are  approximately  1/4.  Thus,  there  could  be  upto  a  25 
percent  error  in  the  linearized  wave  solution. 
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Magnitudes  of  the  phase  and  particle  velocities  of  the  primary 
dispersive  waves;  the  phase  speeds  in  parentheses  were  computed  by 
setting  the  mean  current  to  zero. 
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Since  we  have  observed  a  horizontally  stratified  flow  with 
vertical  shear,  we  shall  investigate  the  stabilitity  of  the  flow  in 
the  presence  of  wave  disturbances.  The  corresponding  instability 
phenomenon  is  baroclinic.  Idien  it  occurs,  the  available  potential 
energy  of  the  sloping-isopycnal  mean  state  is  converted  to  the 
potential  and  kinetic  energy  of  the  perturbations.  A  consequence  of 
baroclinic  instability  is  that  the  wave  disturbances  will  grow  and 
the  tilted  mean- state  isopycnal  surfaces  will  become  more 
horizontal ,  that  is  warm  fluid  will  rise  and  cold  fluid  will  sink. 
Another  instability  phenomenon,  which  is  not  considered  here,  is 
barotropic  in  which  the  kinetic  energy  of  the  mean  flow  is  converted 
to  tiie  Kinetic  energy  of  the  perturbations.  Barotropic  instability 
can  only  occur  if  the  mean  flow  has  a  horizontal  shear.  The 
interested  reader  is  referred  to  Pedlosky  (1979)  and  LeBlond  and 
Mysak  (197d)  for  discussion  on  both  barotropic  and  baroclinic 
instab il  ities. 

Mathematically,  the  linear  couplings  in  (2.43)  give  rise  to 
baroclinic  instability.  Assuming  the  ocean  bottom  is  flat,  dropping 
the  nonlinear  terms,  and  performing  a  triple  Fourier  transformation, 
(2.43)  can  be  cast  as  an  eigenvalue  problem  in  matrix  algebra: 
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( 6 .1  e) 


where  em=  1-932  is  evaluated  by  (2.36b),  x^=5 .149x10”^  km"2 
is  the  inverse  of  the  internal  radius  of  deformation  of  the  1st  mode 
squared  and  j6q ( k ,  1  ,<7)  and  ^(k.l  ,0)  are  the  spectra  of  the 
modal-amplitude  functions  of  the  1st  and  2nd  mode  perturbation 
pressures,  respectively,  as  defined  in  (2.47).  The  modal -ampl i tude 
vectors  of  the  barotropic  and  baroclinic  mean  currents  are  denoted 
by  (Uq,vq)  and  (Uj.Vj),  respectively.  For  a  given 
wavenumber  vector  (k ,1 ) ,  the  wavefrequencies  a,  that  is  the 
eigenvalues,  are  given  by 
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at(k,l)={a11+a22)/2  *  [(a11-a22)2+4a21a12]1/2/2 

(6.2) 

Note  that  the  coupling  is  caused  by  the  baroclinic  mean  current 
only,  and  when  coupling  is  neglected  o_  and  o+  are  the  same  as 
the  frequencies  of  the  dispersive  barotropic  and  baroclinic  waves, 
respectively.  For  disturbances  with  (k,l)'s  satisfying 

2 

( all"a22 ^  ^  ~^a12a21 ’  (6.3) 

the  wavefrequencies  are  complex.  Under  this  condition,  since  a+ 
and  o_  are  complex  conjugates,  one  of  them  must  have  a  positive 
imaginary  part  that  corresponds  to  instability. 

To  investigate  whether  the  observed  waves  are  unstable,  we 
solved  (6.3)  for  the  region  of  instability  in  the  wavenumber  domain, 
i.e.,  k-1  plane,  using  the  estimated  values  of  (Ug,vQ)  and 
(u^v^.  The  shaded  area  on  the  k-1  plane  as  displayed  in 
Fig.  6.1  is  the  region  of  instability.  In  the  figure,  we  also  plot 
the  locations  of  the  observed  wave  disturbances.  The  disturbances 
are  all  located  outside  the  shaded  area,  implying  that  the  waves  are 
stable,  at  least  in  the  general  area  where  the  experiment  was 
conducted.  However,  we  must  warn  that,  as  the  waves  approach  the 
western  boundary,  they  may  encounter  changes  in  the  direction  and 
intensity  of  the  mean  flow  such  that  some  or  all  of  the  three  waves 
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Figure  6.1.  The  stability  of  the  vertically  sheared  mean  flow  In 
the  tomographic  region  In  the  presence  of  wave  disturbances  In  the 
first  barocllnlc  mode.  The  region  of  Instability  on  the 
(k,l)-plane,  l.e.  In  the  wavenumber  domain,  Is  the  shaded  area,  and 
the  wavenumber  vectors  of  the  observed  disturbances  (+)  are  In  the 
stable  region. 
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can  oecome  unstable  and  develop  Into  Intense  eddies.  This  is 
because,  as  the  mean  current  becomes  stronger,  the  region  of 
instability  becomes  larger;  also  as  the  flow  direction  changes,  so 
does  the  location  of  the  unstable  region. 

In  spite  of  the  fact  no  inconsistency  between  the  theory  and 
observations  has  been  found,  we  recognize  that  a  complete 
investigation  of  the  wave  dynamics  was  disallowed  by  the  limitations 
imposed  by  the  data.  First,  we  were  unable  to  observe  any  weakly 
nonlinear  phenomenon  of  the  baroclinic  perturbations  because  the 
data  occupy  a  time  interval  which  is  less  than  one  wave  period. 
Secondly,  due  to  the  insufficiency  of  explicit  current  measurements, 
we  were  unable  to  observe  the  barotropic  waves. 


6.3  Comparison  With  The  MODE  Wave  Fits 


The  Mid-Ocean  Dynamics  Experiments,  MQDE-0  and  MODE-1,  were 
designed  to  investigate  mesoscale  motions  and  their  role  in  the 
general  circulation  in  a  ~400  km  square  region  at  28°N,  69°40'W, 
just  north  of  the  tomographic  region.  MODE-O  was  a  collection  of 
several  pilot  studies  that  were  carried  out  between  1971  and  1972  to 
identify  the  space  and  time  scales  of  the  energies.  It  was  then 
followed  by  MODE-1  in  the  spring  of  1973,  which  is  probably  the  most 
comprehensive  large-scale  experiment  to  date.  MODE-1  lasted  for 
about  4  months. 

McWilliams  and  FI  ierl  (1976)  have  fitted  the  pi anetary-wave 
model  to  the  MODE-O  and  MODE-1  data  sets,  separately.  While  the 
former  contained  only  current-meter  records  from  7  separate  moorings 
and  mostly  from  beneath  the  main  thermocline,  the  latter  was  a  much 
larger  and  more  uniform  data  set,  obtained  from  a  variety  of 
instruments:  current  meters,  moored  temperature  sensors,  CTD's  and 
STD's,  and  SOFAR  floats.  The  MODE-O  and  MODE-1  data  sets  have 
durations  of  3  and  4  months,  respectively.  In  constrast  to  the 
observational  system  deployed  in  the  tomographic  experiment,  the 
MODE  arrays  consisted  of  spot  measurements  only,  which  unlike  the 
acoustic  travel-time  measurements,  could  be  severely  contaminated  by 
undesirable  small-scale  features. 

In  the  same  way  as  our  study  but  for  weighting,  McWilliams  and 
Flier!  chose  the  optimal  wave  parameters  to  minimize  a  quadratic 


error  norm  for  the  differences  between  the  data  and  the  fit. 

Instead  of  specifying  the  weighting  factors  in  the  error-norm 
minimization  according  to  the  noise  variances,  as  was  done  in  our 
stochastic  estimates,  they  have  assigned  equal  weighting  to  each 
datum  of  the  same  type  and  made  total  data  energies  equal  for  all 
types  when  incorporating  data  of  different  types.  Under  the 
circumstances,  we  believe  that  their  estimates  do  not  differ 
significantly  from  the  stochastic  maximum-likelihood  estimates. 
Decause  estimates  are,  in  general,  not  sensitive  to  the  choice  of 
weignting  factors  when  the  number  of  data  is  much  larger  than  the 
number  of  unknown  parameters. 

The  best  MODE-O  fit  has  a  high  correlation  of  ~0.8  with  the  data 
and  accounted  for  over  half  (~60  percent)  of  the  data  energy.  It 
consisted  of  a  pair  of  barotropic  waves  and  no  baroclinic  waves, 
propagating  in  the  absence  of  mean  flow.  The  reason  for  not  being 
able  to  observe  any  baroclinic  waves  is  probably  that  MODE-O  was 
primarily  an  experiment  of  the  lower  layer  (below  the  main 
thermocline)  where  the  barotropic-mode  kinetic  energy  dominates. 
Altnough  a  few  current-meter  records  from  the  main  thermocline  were 
available,  they  came  from  only  two  horizontal  locations.  Therefore, 
they  were  not  adequate  for  resolving  baroclinic  waves,  since  each 
wave  involves  at  least  4  free  parameters.  In  contrast,  the 
tomographic  experiment  was  primarily  for  the  observations  of  the 
baroclinic  modes.  In  the  experiment,  the  acoustic  array,  the  CTD 
casts  and  the  moored  temperature  sensors  and  recorders ,  all  probed 
the  temperature  field  in  which  the  barocl  inic-mode  effects  dominate, 


and  for  the  same  reasons  as  above,  the  current  data  from  only  2 
horizontal  locations  were  inadequate  for  resolving  barotropic  waves. 

Nonl  inear  interactions  within  the  MODE-O  wave  fit  and  our  wave 
fit  to  the  data  of  the  tomographic  experiment  were  found  to  be  weak: 
forced  wave  amplitudes  were  predicted  by  the  weakly  nonlinear  theory 
to  be  about  20  percent  of  the  primary  wave  amplitudes.  Thus  both 
sets  of  waves  represent  fully  consistent  linear  solutions. 

On  the  other  hand,  both  barotropic  and  baroclinic  waves  were 
observable  by  the  MODE-1  array  that  contained  both  adequate  current 
and  temperature  measurements.  The  best  MODE-1  fit,  having  a 
correlation  of  ~0.7  and  accounted  for  1/2  of  the  data  energy,  has  a 
pair  of  barotropic  and  a  pair  of  first-barocl  inic  waves.  Consistent 
with  the  MODE-O  fit,  no  significant  energy  of  the  mean  flow  was 
found.  However,  unlike  the  other  two  fits,  nonlinear  interactions 
were  found  to  be  of  marginal  but  uncertain  importance  within  the 
MODE-1  fit:  forced  wave  amplitudes  were  predicted  to  be  large  and 
comparable  to  the  primary  wave  amplitudes.  But,  by  searching  in  the 
data  for  the  forced  waves  with  the  given  frequencies  and 
wavenumbers,  McWilliam  and  Flierl  have  found  no  significant  energy 
in  them.  To  explain  this,  McWilliams  and  Flierl  suggested  that  the 
nonlinear  transfers  of  energy  might  have  acted  in  such  a  way  as  to 
preserve  crucial  features  of  the  linear  solution,  empirically. 

From  the  results  of  the  3  wave  fits,  we  can  summarize  the 
dynamics  of  the  mesoscale  motion  in  the  general  area  where  MODE-O, 
MODE-1  and  the  tomographic  experiment  were  conducted  as  follow: 


(1)  The  motion  appears  to  be  dominantly  wave-like:  planetary 
waves  have  consistently  accounted  for  more  than  and  about  1/2  of  the 
total  signal  energies  observed  in  different  places  and  during 
different  time  periods. 

(2)  The  vertical  structure  is  dominated  by  the  barotropic  and 
the  first  baroclinic  modes,  with  the  latter  containing  the  greatest 
fraction  of  the  available  potential  energy  among  all  the  vertical 
modes . 

(3)  Locally,  the  space-time  behavior  of  the  motion  is  well 
predicted  by  the  dispersion  relation,  i.e.  linear  dynamics.  But,  as 
the  lengths  in  space  and  time  considered  increase,  the  linear 
prediction  becomes  less  accurate;  this  is  demonstrated  by  the  fact 
that  the  MODE  wave  fits,  which  involved  a  larger  region  and  1  onger 
durations,  have  poorer  quality  (i.e.  smaller  correlations  and  less 
signals  accounted  for)  than  our  wave  fit.  Thus,  planetary  wave 
propagation  is  strictly  a  local  phenomenon. 

(4)  Most  of  the  waves  observed  in  the  three  experiments  have 
westward  group  velocities,  suggesting  that  wave  disturbances  are 
originated  in  the  east. 

(b)  The  phase  propagation  is  generally  westward,  and  the  wave 
lengths  of  the  propagating  baroclinic  waves  are  typically  of  order  a 
few  hundreds  of  kilometers. 

(b)  Evidence  exists  for  the  existence  of  a  westward  mean  flow 
with  diminishing  flew  energy  towards  the  north:  a  weak  westward  mean 
flow  with  vertical  shear  was  found  in  the  tomographic  region  and 


vanishing  mean-flow  intensity  was  found  in  the  MODE  region. 

(7)  In  each  of  the  three  experiments,  MODE-O,  MODE-1  and  1981 
Ocean  Tomography,  the  data  exhibited  more  high  frequency  variability 
than  the  wave  fits.  Therefore,  nonlinear  wave-wave  interactions 
must  be  consequences  of  wave  propagation. 

(8)  Stronger  nonlinear  wave-wave  interactions  should  occur  in 
the  north,  because  the  westward  mean  flow  can  reduce  the 
interactions  in  the  south  by  increasing  the  westward  phase 
velocities  there. 
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6.4  Comparison  Of  The  Different  Mapping  Methods 

Previously,  Cornuelle  (1983)  andCornuelle  et  al .  (1985)  have 
used  the  same  acoustic  and  hydrographic  data  to  map  the  ocean. 

Their  mapping,  however,  was  performed  on  an  "objective"  and  "daily" 
basis,  and  the  two  sets  of  data  were  used  separately  in  independent 
linear  inversions.  The  mapping  performed  by  us  differs  from  that  of 
Cornuelle  et  al  .  in  three  major  aspects:  (1)  we  have  incorporated 
the  hydrographic  and  the  moored  temperature  data  together  with  the 
acoustic  data  in  the  same  inversions,  (2)  this  mapping  is 
"subjective"  and  takes  into  account  the  time-dependence  of  the 
field,  and  (3)  the  system  being  solved  here  is  nonlinear  with 
respect  to  the  unknown  parameters.  By  "subjective"  mapping,  as 
oppose  to  "objective"  mapping,  we  mean  that  the  space-time  relation 
imposed  on  the  unknown  field  in  the  inversion  of  data  is  a 
deterministic  one. 

In  this  section,  we  will  first  describe  the  method  of  Cornuelle 
et  al  .  and  discuss  the  differences  and  similarities  between  our 
method  and  theirs.  We  will  then  present  some  possible  extensions  of 
their  method  to  take  into  account  the  time-dependence  of  the  field. 
The  advantages  and  disadvantages  of  the  different  methods  will  also 
be  discussed.  A  discussion  on  the  improvement  on  the  inversion 
result  due  to  the  incorporation  of  the  spot  measurements  will  be 
presented  in  the  next  section.  For  the  purpose  of  making  the 
algebra  as  simple  as  possible  in  this  discussion  but  without  loss  of 


generality,  let  us  assume  that  the  positions  of  the  acoustic 
moorings  are  accurately  known  in  the  following  mathematical 
formulations.  (A  discussion  on  the  effect  of  unknown  mooring  motions 
on  the  estimate  of  <sc  is  presented  separately  in  Ch.  7.) 

Cornuelle  et  al  .  wanted  to  obtained  the  best  possible  estimate 
of  the  perturbation  field  (ScU.t^)  of  sound  speed  in  space 
x  =  (x,y,z)  on  the  days  t  =tk  ' s  of  the  acoustic  transmiss ions ,  based 
on  the  acoutic  data  alone.  They  have  chosen  a  linear  estimator  and 
defined  the  best  estimate  to  have  minimum  variance.  Their  method  of 
inversion  is  analogous  to  the  objective  mapping  of  Bretherton  et  al  . 
(  1976),  in  vrfiich  a  specification  of  the  autocorrelation  function  of 
the  unknown  field  is  required.  Cornuelle  et  al  .  have  assumed  that 
the  unknown  field  6cU,t)  to  be  horizontally  homogeneous  and 
temporally  uncorrel  ated.  Based  upon  the  analysis  of  Richman  et  al  . 

(  1977)  on  the  MODE -array  data,  they  have  taken  the  horizontal 
autocorrel  ation  function  to  be  Gaussian  in  shape  with  a  decay  scale 
of  100  km.  Vertically,  they  have  chosen  to  represent  6C  by  the 
empirical  orthogonal  modes  derived  from  the  MODE -hydrographic  data. 
Thus,  the  correlation  function  can  be  expressed  as 
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i  =  1 , 2, 3,...,  (6.4) 


where  6C-  represents  the  horizontal  structure  of  the  sound-speed 


2  ... 


perturbation  associated  with  the  i  th  mode  and  o,  is  the 


expected  energy  of  ac... 

•  The  tomographic  system  solved  by  Cornuelle  et  al.  is  linear  and 
may  be  cast  parametrically,  at  time  tk,  as 


6t°(tk)  =  A  a(tk)  +  v(tk) 


(6.5) 


where  at°(tk)  is  an  mxl  data  vector  containing  the  observed 
travel-time  perturbations,  v(tk)  represents  the  noise  vector,  and 
a(tk)  is  an  nxl  parameter  vector  to  be  estimated,  containing  the 
unknown  amplitudes  of  the  sinusoidal  wavenumber  components  of 
6C.j .  Unlike  the  other  quantities  in  (6.5),  the  linear  operator  A, 
that  is  an  mxn  weighting  matrix,  is  time- in  dependent,  and  A  can  be 
evaluated  using  (3.8).  Because  ac.j ‘s  are  spatially  homogeneous, 
the  Fourier  components  in  the  wavenumber  spectra  are  uncorrelated, 
implying  that  the  time-independent  covariance  Ca  of  a(t)  is  a 
diagonal  matrix.  Clearly,  an  advantage  of  choosing  to  estimate  a 
instead  of  the  ac^ 's  themselves  is  the  minimization  of  the  storage 
area  required  in  the  computer.  Since  the  system  (6.5)  is  linear  and 
the  sound- speed  perturbation  and  noise  are  uncorrelated,  the 
Gauss-Markov  Theorem  immediately  asserts  that  among  all  linear 
estimates,  the  one  with  the  smallest  variance  is 


a*(tj  =  C  *(tJATc:.1{t,.)at0(tt) 


(6.6) 


is  the  error-covariance  matrix  of  a*!^)  and  Cy(t|()  is  the 
covariance  matrix  of  the  noise  vU^)  (Liebelt,  1967).  An 
interesting  fact  is  that  the  same  estimate  can  be  obtained  by 
maximizing  (minimizing)  the  corresponding  likelihood  (objective) 
function.  This  is  not  surprising,  however,  because  as  we  may  recall 
from  the  discussion  in  Ch.  4,  when  the  system  is  linear  and  the  a 
priori  information  is  incorporated  as  data  in  the  system,  the 
maximum-likelihood  estimate  has  the  lowest  theoretically  attainable 
variance.  Therefore,  an  obvious  similarity  between  the  method  of 
Cornuelle  et  al.  and  ours  is  that  they  both  compute 
maximum- 1  ik el  ihood  estimates.  However,  they  did  not  consider  the 
time-dependence  of  the  field  in  their  inversions;  their  estimates 
thus  were  three-dimensional  ones. 

The  generation  of  a  four-dimensional  estimate  is  more 
desirable.  One  reason  is  that  the  quality  of  the  estimate  of  the 
unknown  field  is  generally  improved  when  the  set  of  observations 
used  in  the  inversion  is  enlarged.  In  the  detection  of  narrow-band 
planetary  waves  from  the  data,  we  have  mapped  the  ocean  on  a 
subjective  and  four-dimensional  basis,  by  imposing  that  the  local 
sound-speed  field  is  predominantly  perturbed  by  the  waves.  That  is 
in  the  inversions,  we  have  required  the  wavenumber  spectra  to  be 
sharply  peaked  at  some  wavenumbers  and  the  spectra  at  different 
times  to  be  related  by  the  dispersion  relationship.  It  is 


understood  from  inverse  theory  that,  if  the  unknown  function  is  an 
impulse,  the  best  linear  estimate  of  the  function  will  generally 
contain  side  lobes  in  addition  to  a  main  lobe  at  the  location  of  the 
impulse  (Wiggins,  1972,  Wunsch,  1978,  etc.).  The  leakage  of  energy 
to  the  side  lopes  and  the  broadening  of  the  main  peak  is  a 
consequence  of  the  lack  of  determining  power  which  is  always 
associated  with  an  under  determined  system.  The  implication  is  that 
narrow- band  planetary  waves  cannot  be  adequately  resolved  by 
directly  estimating  the  parameter  vector  a(t)  that  represents  the 
continuous  spectral-amplitude  functions.  Oneway  to  eliminate  the 
side  lobes  and  sharpen  the  main  lobe  is  to  re  parameterize  the 
wavenumber  spectra  by  the  location,  amplitude  and  phase  of  the 
peaks,  and  tnis  is  exactly  what  we  have  done  to  implement  the 
narrow-band  constraint  in  the  inversions. 

The  narrow-band  constraint  transforms  the  underdetermined  linear 
systems  at  different  tk  into  one  over  determined,  nonlinear 
system.  The  linearization  of  the  nonlinear  system  with  respect  to 
the  unknown  wavenumbers  is  not  valid  because  the  phase  functions  of 
the  waves  can  be  of  order  one  or  bigger  at  large  distances  and 
times,  implying  that  we  cannot  use  standard  direct  techniques  such 
as  Gaussian  elimination  and  the  singular- val  ue  decomposition,  and 
must  resort  to  the  use  of  iterative  minimization  methods  for  the 
inversions.  We  prefer  gradient  methods  over  other  iterative  methods 
because  they  guarantee  convergence  (Ch.  4). 
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The  error  covariance  of  the  estimate  associated  with  the  linear 
system  (b. 5)  does  not  dependent  on  the  data  and  the  estimate  itself, 
out  only  on  the  statistics  of  the  unknown  field  and  noise,  and  the 
geometry  of  the  acoustic  array.  Difficulty  in  the  analysis  of 
variance  increases  once  the  estimation  problem  becomes  nonlinear. 

In  fact,  the  variance  of  our  nonlinear  wave  fit  could  not  be 
obtained  before  the  estimate  was  computed.  Therefore,  in  the  design 
of  tomographic  experiments,  it  is  definitely  more  convenient  to  work 
with  the  linear  systems.  However,  the  wave  fit  accounts  for  the 
dynamics. 

While  Cornuelle  et  al .  have  adopted  the  empirical  modes  (derived 
from  the  MODE  data)  as  the  vertical  basis  of  the  sound-speed 
perturbations,  we  have,  instead,  adopted  the  analytical  modes  of 
Rossby  waves.  An  advantage  in  using  the  analytical  modes  is  that 
the  corresponding  horizontal- structure  equation  can  readily  be 
obtained  from  the  literature.  The  1st  and  3rd  empirical  modes 
strongly  resemble  the  1st  and  3rd  baroclinic  analytical  modes,  and 
the  2nd  empirical  mode  is  strongly  surface-intensified.  (The 
empirical  modes  are  ordered  according  to  the  ratios  of  their 
potential  energy,  with  the  most  energetic  one  being  defined  as  the 
1st  mode.)  The  first  four  empirical  modes  were  used  by  Cornuelle  et 
al .  (1985)  and  assigned  equal  energy  a  priori;  however,  their 
inversions  have  yielded  a  result  of  1  :  0.1  :  0.05  for  the  ratios  of 
the  energy  of  the  first  three  modes,  showing  consistently  that  the 
vertical  structure  is  dominated  by  the  1st  baroclinic  mode. 


Moreover,  the  amplitudes  of  the  higher  modes  were  poorly  determined, 
because  most  of  the  ray  paths  identified  did  not  penetrate  into  the 
mixed  layer  to  sense  the  surface- intensi fied  mode,  and  the  other 
nigner  modes  are  basically  transparent  to  acoustic  tomography  (see 
Ch.  3  for  the  discussion).  Our  Inversions,  therefore,  have  not 
attained  a  poorer  vertical  resolution  although  only  the  1st 
Daroclinic  analytical  mode  was  used. 

In  oojective  mapping,  the  experimental  noise  basically  consists 
of  the  measurement  and  internal-wave  related  errors  that  generally 
have  zero  expected  values.  However,  in  subjective  mapping,  the 
additional  error  introduced  by  the  idealizations  and  assumptions 
usea  in  building  the  dynamical  model  may  have  a  nonzero  statistical 
average.  A  consequence  of  the  zero-mean  hypothesis  on  the  errors 
that  in  reality  have  nonzero  expected  values  is  the  generation  of 
bias  error  in  the  estimate.  To  illustrate  this,  let  us  suppose  that 
the  model  equations  «t°=f.(g.)+v'  associated  with  a  pure  acoustic 
detection  of  narrow-band  planetary  waves  can  be  linearized  about  the 
true  values  of  the  wave  parameters  £,  so  that  the  expectations 
of  «t°,  £  and  v*  are  related  by,  approximately, 

<£t°>  =  £t  +  <v’>  •  (6.8) 

3£ 

After  solving  the  linearized  system  and  then  using  (6.8),  the 
expectation  of  the  maximum-likelihood  estimate  £*  can  be  written 


approximately  as 


<£*>  =  +  b  (6.9a) 

where 

b.  L(^t))TC'}(^t,)]'1[^t)]V!<v'>  (6.9b) 

3£  -  3£  3£ 

is  the  bias  of  the  estimate  and  Cy,  is  the  covariance  of  v'  . 
Clearly,  the  bias  exits  when  <v'>  is  not  zero. 

In  spite  of  the  generation  of  bias  in  the  estimate,  subjective 
mapping  has  its  appeal.  By  trying  many  different  dynamical  models 
in  the  inversions,  the  data  can  make  diagnoses  for  plausible 
dynamics.  Hence,  one  can  learn  the  dynamics  of  the  field  directly 
from  the  Inversions  and  then  use  the  knowledge  gained  to  make  model 
corrections.  In  fact,  the  generation  of  bias  is  not  of  major 
concern,  since  when  the  model  used  is  accurate,  the  bias  will  be 
small.  Moreover,  the  estimate  generated  by  objective  mapping  is 
also  Diased.  Using  (6.5),  (6.6)  and  (6.7),  we  can  easily  show  that 
the  expectation  of  the  objective  estimate  of  a ( t^ )  is  given  by 


<a*(tk)>  =  [A  TC~y  1  ( tk )A+C" 1  ]'  Vc;1  ( tk ) A  a t( tk  )  (6.10) 


where  at(tk)  is  the  true  value  of  a(tt).  Clearly,  the 


objective  estimate  is  biased,  1  .e.  <a^(tk)>^at(tk) ,  unless  no 
a  priori  information  is  asserted,  that  is  unless  the  a  priori 
covariance  approaches  infinity.  But,  if  approaches 
infinity,  so  will  the  error  covariance  CAa*  of  a*  (which  is 
expressed  in  (6.7)),  because  A^C^ft^jA  is  singular.  As  a 
matter  of  fact,  sufficient  a  priori  information  must  be  supplied  to 
generate  enough  bias  to  ensure  the  stability  of  the  Inversion. 

The  inversion  method  of  Cornuelle  et  al.,  which  uses  the  linear 
minimum-variance  criterion  for  the  estimates,  in  principle,  can  be 
modified  to  become  four-dimensional.  An  objective  approach  is  the 
implementation  of  the  time  correlation  of  the  field  into  (6.4)  and 
the  expansion  of  system  (6.5)  to  include  observations  at  other 
times.  Let  us  suppose  that  there  are  N+l  equally  spaced  data  points 
in  each  time  record  of  travel -time  perturbation,  so  that  the 
expanded  system  can  be  cast  as 

at'0  =  A'a'  +  v1  (6.11a) 


where 
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(6.11b) 


Once  again,  the  linear  minimum- variance  estimate  a'*  of  a'  can  be 
found  by  applying  the  Gauss-Markov  Theorem,  at  least  in  theory. 


However,  the  implementation  of  the  estimation  procedure  on  available 
computing  machinery  may  not  be  feasible,  since  the  storage 
requirements  for  the  covariance  matrices  Ca.  of  a'  and  CAa.*  of 
a'*  can  be  large  and  thus  the  computation  of  a'*  might  be  too 
costly.  To  obtain  a'*,  we  need  to  evaluate  its  (error)  covariance  by 

CA.,*  •  (A^C-fA'+Clfr1.  (6.12a) 

— Aa  —  —V  —  —a 

or  equivalently,  as  shown  in  Liebelt  (1967),  by 

CA  .*  ■  C^. -(C  .A,T)(A,C_.A,T+CV,  )"1(cj,, a’T)T.  (6.12b) 

Because  the  system  is  highly  under  determined,  the  latter  formula 
(6.9b),  which  involves  the  computation  of  the  inverse  of  a  smaller 
matrix  should  be  used;  the  inversion  of  this  m(N+l)xm(N+l)  matrix 
would  consume  the  largest  portion  of  the  total  computer  time 
required  to  produce  the  estimate.  Since  the  time  required  to 
perform  a  matrix- Inverse  operation  is  approximately  proportional  to 
the  cube  of  the  row  (or  column)  dimension  of  the  matrix  (Dahlquist 
and  Bjorck,  1974),  this  four-dimensional  objective  mapping  can  be 
very  inefficient  for  large  N. 

An  alternative  approach,  which  Is  subjective,  is  to  impose  a 
deterministic  relation  instead  of  a  statistical  correlation  between 
the  perturbati ons  or  the  wavenumber  spectra  at  different  times.  In 


this  case,  a  linear,  mini  mum -variance ,  four-dimensional  estimate  can 
also  be  obtained  if  the  dynamical  relation  is  linear  or  can  be 
closely  approximated  by  a  linearization  at  all  time  steps,  such  that 

a(tkn)  =  ^a (tk  ) ;  k=0,l,2 . N-l.  (6.13) 

With  the  presence  of  the  dynamical  relation  (6.13),  the  number  of 
independent  or  free  parameters  in  (6.11)  Is  drastically  reduced,  and 
one  can  choose  the  unknown  to  be  the  Initial  spectral  amplitudes 
a_(tQ)  or  the  spectral  amplitudes  at  any  other  time.  As  a  result, 
the  covariance  matrices  are  nolonger  overly  large.  Furthermore,  the 
linear,  minimum-variance,  subjective  estimate  can  be  computed  using 
an  accelerated  algorithm  for  a  Kalman  filter  that  corresponds  to  a 
sequence  of  predictions  and  reestimations  at  each  of  the  time  steps 
(Gelb  et  al .,  1974),  so  that  an  abundance  of  computation  time  can  be 
saved.  In  (6.13),  the  Dj^'s  are  often  called  the  transition 
ma  tri ces . 

A  derivation  of  the  sequential -reestimation  algorithm  through 
the  minimization  of  the  corresponding  objective  function  is 
presented  in  Appendix,  and  we  will  demonstrate  the  superior 
efficiency  of  this  algorithm  next.  In  Appendix,  we  show  that,  by 
choosing  ajt^)  to  be  the  free  parameters,  the  optimal  estimate 
aMt^)  of  ajtjg)  can  be  obtained  by  sequentially  computing 


-  ^l+l^  = -1  +  1^-- ^1+1  ^i  +  i^+Sa  ^1+1^-  ^1+1^ 


(6.14a) 


in  order  of  increasing  1 ,  where 


-1+1  =  -a^l+l^-a^l +1^-  ^--a^l+l^-  t-v^l+1^  ^a^l+l^-^ 


ap(t1+i)  =  0^(1,  ) 


(6.14b) 


(6.14c) 


&.<W  « 


(6 .1 4d) 


There  are  altogether  N+l  applications  of  (6.14)  in  the  sequence,  and 
in  each  application,  the  computation  of  the  inverse  of  an  mxm  matrix 
is  involved  (as  indicated  in  (6.14b)).  Hence,  the  total  computer 
time  required  by  the  sequential-reestimation  algorithm  is 

3 

proportional  to  (N+l)m  .  Thus,  when  compared  to  the 
four-dimensional  objective  mapping,  subjective  mapping  with  a  linear 
or  linearizade  dynamical  relation  is  (N+l)2  times  faster.  For 
large  N,  the  computational  cost  saved  by  the  sequential- reestimati on 
alogrithm  in  performing  a  four-dimensional  mapping  can  be 
substantial . 


One  would  probably  consider  using  the  economical 
sequential- reestimation  algorithm  when  the  sound-speed  perturbations 
are  assumed  to  be  produced  by  broad-band  planetary  waves.  However, 
one  must  be  aware  that  the  applicability  of  the  algorithm  depends 
critically  on  the  validity  of  the  linearization  of  the  dynamical 
relation.  When  the  relation  is  nonlinear,  the  error  introduced  by 
the  linearizations  involved  at  each  time  step  demands  special 
investigation,  since  the  error  can  propagate  along  the  reestimation 
sequence  and  be  amplified.  Thus,  the  presence  of  a  mean  flow  in  the 
tomographic  region  can  present  some  difficulties  in  the 
implementation  of  the  wave  dynamics  into  the  transition  matrices 
,  because  the  dynamical  relation  between  the  wavenumber  spectra 
at  different  times  is  nonlinear  when  the  intensity  and  direction  of 
the  flow  are  unknown.  (However,  even  when  the  linearization  is 
invalid,  one  may  still  estimate  the  broad-band  spectra  by  iterative 
minimization  techniques.)  This  broad-band,  subjective  mapping  has 
yet  to  be  performed,  but  it  should  be  of  interest  to  compare  the 
hypothesis  of  broad-band  to  the  hypothesis  of  narrow-band  wave 
disturbances  in  describing  the  mesoscale  fluctuations  in  the  region. 

We  have  used  the  iterative  gradient  method  of  Fletcher  and 
Powell  (1963)  for  our  nonlinear  inversions,  that  is  the  wave  fits. 

In  order  to  obtain  the  gradient  vector  of  the  objective  function, 
which  is  required  by  the  method,  the  gradient  vectors  of  the 
wave-induced  travel-time  perturbations  must  first  be  computed,  which 
involves  integrating  the  derivatives  of  the  wave-induced  sound- speed 


perturbations  with  respect  to  each  of  the  wave  parameters  along  all 
the  long-range  ray  paths  used.  The  method,  therefore,  could  be  very 
inefficient  if  the  integration  operations  were  to  be  performed  at 
each  iterative  step  of  the  minimization  process.  To  accelerate  the 
process,  we  have  precalculated  the  matrix  A  of  the  linear  system 
(6.5)  and  have  it  stored  in  the  computer,  so  that  the  gradients 
could  be  interpolated  by  a  two-dimensional  cubic  spline  whenever 
they  were  needed.  Excluding  the  computer  time  required  to  compute 
A,  each  minimization  consumed  40  to  60  minutes  on  a  VAX  11/780.  We 
also  experimented  the  daily  (I.e.  three-dimensional)  objective 
inversions  using  Gaussian-el imination  techniques  on  the  VAX  11/780 
and  found  that  each  of  the  Inversions  would  consume  approximately  5 
minutes,  again  excluding  the  time  required  to  compute  A.  Therefore, 
by  projection,  the  time  required  to  do  a  broad-band, 
four-dimensional  inversion  using  the  sequential-reestimation 
algorithm  involving  8  time  steps  (i  .e.  N=8)  or  to  do  a  sequence  of  9 
daily  inversions  is  approximately  (N+ I)x5=9x5=45  minutes.  This  is 
quite  comparable  to  the  time  required  to  do  one  minimization  of  the 
objective  function  of  the  wave  parameters,  with  the  same  nuntoer  of 
data  incorporated.  Finally,  the  time  required  to  do  a 
time -dependent  objective  mapping  that  incorporates  the  same  number 
of  data  is  approximately,  again  by  projection, 

( N+l)  **45=81x45=  3645  minutes,  indicating  that  the  computational 
burden  is  huge. 


6.5  Pure  Acoustic  Estimates 

The  spot  observations  contain  some  pieces  of  information  about 
the  waves  which  are  independent  to  those  detected  by  the  acoustic 
array.  In  the  wave  fits,  the  additional  independent  information 
acts  to  enhance  the  uniqueness  and  reduce  the  variance  of  the 
estimates  of  the  wave  parameters  and  the  corresponding  sound- speed 
perturbati  ons . 

Wien  the  spot  measurements  are  withheld,  the  estimates  are 
degraded.  In  Fig.  6.2  and  6.3,  we  show  the  maps  of  the  sound-speed 
estimate  on  yearday  83  and  120  at  a  depth  of  700  m,  generated  by  a 
fit  of  3  waves  of  Model  1  to  the  travel -time  data  alone,  and 
tnerefore  corresponding  to  the  result  of  a  time-dependent  pure 
acoustic  inversion.  The  two  corresponding  error  maps  are  presented 
in  Fig.  6.4  and  6.5,  shewing  the  contours  of  the  standard  deviation 
of  the  error  of  the  sound-speed  estimate.  These  errors  are  about 
naif  the  size  of  those  errors  in  the  time-independent  acoustic  maps 
produced  by  Cornuelle  (1983)  and  Corn uelle  et  al .  (1985),  but  are  2 
times  larger  than  those  of  the  optimal  fit  when  the  spot 
measurements  are  included  (see  Fig.  5.17  to  5.19).  Furthermore,  as 
expected,  the  error  maps  indicate  that  away  from  the  central  region 
of  the  experimental  area,  where  the  ray-path  density  is  low,  the 
mapping  ability  by  the  acoustics  diminishes.  Notice  also  that  the 
errors  on  the  left  half  of  the  square  where  more  ray  paths  have 
traversed  are  slightly  smaller,  as  a  result  of  the  presence  of  the 
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Figure  6.2.  Sound-speed  contours  at  a  depth  of  700  m  of  a  pure 
acoustic  estimate  of  the  wave  field  in  the  experimental  square  on 
year  day  B3.  Contour  interval  is  1  m/s  and  the  reference  sound- speed 
at  this  depth  is  1506  rr/s. 
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Figure  6.3.  Sound-speed  contours  at  a  depth  of  700  m  of  a  pure 
acoustic  estimate  of  the  wave  field  in  the  experimental  square  on 
year  day  120.  Contour  Interval  Is  1  iVs  and  the  reference 
sound-speed  at  this  depth  Is  1506  m/s. 
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Figure  6.4.  Contours  of  the  standard  deviation,  at  a  depth  of  700  m 
in  the  experimental  square  on  yearday  83,  of  a  pure  acoustic 
estimate  of  the  sound-speed  perturbations  in  the  wave  field. 

Contour  Interval  is  0.1  nVs. 


Figure  6.5.  Contours  of  the  standard  deviation,  at  a  depth  of  700  m 
In  the  experimental  square  on  yearday  120,  of  a  pure  acoustic 
estimate  of  the  sound-speed  perturbations  In  the  wave  field. 

Contour  Interval  is  0.1  nfs. 
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receiver  R5. 

In  addition  to  having  a  larger  error  variance,  the  pure  acoustic 
estimate  of  the  wave  parameters  is  also  nonunique.  However,  the 
different  wave-parameter  estimates  do  produce  a  similar  pattern  in 
the  sound-speed  perturbation,  showing,  qualitatively,  an  elliptical 
cold  eddy,  initially  located  at  the  center  of  the  experimental 
square  and  slowly  moving  westward.  Consistently,  Cornuelle  et  al  . 
have  also  observed  a  similar  pattern  from  their  objective  acoustic 
maps . 

Al  though  the  travel-time  data  obtained  from  this  first 
tomographic  experiment  are  not  powerful  enough  to  determine  the  wave 
parameters  by  themselves,  they  certainly  have  contributed 
significantly  to  the  success  of  the  detection  of  the  waves.  The 
dynamical  field  in  the  time  period  separating  the  two  CTD  surveys 
cannot  be  extrapolated  from  the  surveys  alone;  one  can  hardly  deduce 
any  relation  between  the  two  CTD  maps  (Fig.  5.11  and  5.15)  but  only 
to  observe  from  them  that  the  initial  cold  eddy  has  disappeared  and 
a  front  has  appeared  in  the  experimental  square  at  the  later 
period.  Furthermore,  the  moored  temperature  time  series  obtained  at 
three  horizontal  spots  that  only  occupy  less  than  1/4  of  the  square 
cannot  possibly  determine  the  directions  of  wave  propagation.  (The 
fit  with  three  waves  to  just  the  CTD  and  moored  temperature  data  was 
found  to  be  nonunique.)  Thus,  the  travel-time  data  has  provided  the 
essential  information  on  the  westward  movement  of  a  cold  pattern 
that  links  the  other  information. 


We  have  learnt  from  simulation  inversions  that  when  the 
locations  of  the  acoustic  moorings  are  known,  the  wave  parameters 
can  be  uniquely  determined  by  the  travel -time  data  alone.  In  the 
experiment,  however,  the  acoustic  moorings  S4  and  R5  had  no 
mooring -motion  data,  and  all  the  other  acoustic  moorings  had  some 
gaps  in  the  mooring-motion  data  series.  Therefore,  the  failure  to 
tracx  all  the  acoustic  mooring  motions  has  prevented  the  tomographic 
array  to  perform  optimally  in  the  wave  observation. 

New  et  al .  (1982)  and  Munk  and  Wunsch  (1982)  have  studied  the 
horizontal  resolution  of  the  tomographic  configuration  of  the  1981 
experiment  for  a  perfectly  navigated  array,  using  the  Backus-Gilbert 
method  (1967,  1968  and  1970).  By  considering  the  worst  case,  that 
is  without  the  use  of  a  priori  information  such  as  the  temporal  and 
spatial  correlations  of  the  field.  New  et  al.  have  found  a  minimum 
average  resolution  length  of  100  km  (i.e.  1/3  of  the  array  size). 

By  incorporting  spatial  correlation,  Munk  and  Wunsch  have  reported  a 
resolution  length  of  order  50  km.  Thus,  the  tomographic  array  is 
potentially  capable  of  resolving  waves  or  Isolated  oceanic  features 
of  lengths  as  short  as  approximately  100  km.  In  order  to  attain  the 
same  resolution,  a  conventional  spot-observational  system  would 
require  at  least  a  total  of  36  moorings,  that  is  a  minimum  of  one 
mooring  per  50  km  square  (a  criterion  from  the  Sampling  Principle 
(Steiglitz,  1974,  and  Bendat  and  Pierson,  1971)).  In  comparison, 
the  tomographic  system  that  consists  of  only  9  moorings  is  therefore 
more  economical  and  adequate  than  a  conventional  system  for  ocean 
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monitoring  when  the  acoustic  moorings  are  tracked  accurately. 

Besides  resolution,  an  important  measure  of  system  performance 
is  the  variance  of  the  estimate.  For  perfect  navigation  of  the 
acoustic  array,  a  first-barocl inic  perturbation  signal  of  2  n/s 
(rms)  at  700  m  depth,  a  horizontal  Gaussian  correlation  of  the  field 
with  a  decay  scale  of  100  km,  no  correlation  in  time,  and  a  noise 
level  of  5  ms,  the  standard  deviation  (i.e.  the  square  root  of  the 
variance)  of  the  pure  acoustic  estimate  at  a  depth  of  700  m  is 
contoured  in  Fig.  7.1.  It  is  seen  that  over  60  percent  of  the 
tomographic  region,  mostly  In  the  middle  of  the  square,  has  a 
standard  deviation  which  is  below  *0.4  m/s  or  less  than  20  percent 
of  the  signal.  However,  the  error  increases  to  40  percent  near  the 
western  and  the  eastern  boundaries  where  the  arrays  of  sources  and 
receivers  are  located.  The  increase  in  error  is  due  to  the  fact 
that  the  ray-path  density  is  the  lowest  near  the  acoustic  moorings. 
It  is  obvious  that  the  system  performance  can  be  improved 
efficiently  by  mounting  temperature  recorders  on  the  acoustic 
moorings.  In  doing  this,  the  number  of  moorings  used  in  the 
observational  system  stays  the  same  but  the  variance  is  reduced  in 
the  areas  near  the  moorings. 


6.6  Concluding  Remarks 


The  main  purpose  of  this  study  has  been  to  Investigate  the 
existence  and  dynamics  of  planetary  waves  In  the  tomographic  region, 
and  to  find  out  whether  the  waves,  when  present,  could  be  detected 
from  the  data  of  the  experiment.  The  detection  process  consisted  of 
tne  estimation  of  wave  parameters  and  diagnosing  the  plausible  wave 
dynamics  with  the  data.  From  the  result  of  the  estimation,  we  have 
come  to  tne  following  conclusions:  (1)  stable  and  dispersive 
planetary  waves  did  exist,  at  least  as  a  local  phenomenon  in  space 
ana  time,  (2)  the  wave  propagation  was  strongly  affected  by  the 
local  mean  flow,  even  though  the  mean  flow  was  weak  (a  few  cm/s), 
and  (3)  due  to  the  existence  of  some  experimental  deficiencies  such 
as  untracked  mooring  motions,  the  tomographic  observational  system 
alone  was  unable  to  detect  the  waves;  however,  the  spot  observations 
have  provided  the  additional  information  needed  to  make  the 
detecti  on  successful . 

In  this  particular  study,  we  have  demonstrated  the  usefulness  of 
imposing  dynamical  constraints  in  the  inversions  of  data.  That  is, 
by  imposing  different  but  plausible  dynamics,  one  can  learn  the 
dynamics  of  the  field  directly  from  the  inversions.  The 
incorporation  of  dynamics  may  happen  to  convert  a  linear  system  to  a 
nonlinear  one,  as  this  was  our  case,  but  we  should  not  be  disturbed 
by  this  consequence,  since  there  are  many  iterative  minimization 
techniques  available  for  nonlinear  inversions.  However,  in  the 


design  of  future  tomographic  experiments,  it  is  still  recommended  to 
work  witn  linear  systems  whenever  possible,  because  the 
corresponding  sensitivity  analyses  are  much  simpler  and  analytically 
more  tractable  (Ch.  7  illustrates  the  use  of  linear  systems  for  one 
such  analysis). 

A  significant  consequence  of  the  incorporation  of  the  wave 
dynamics  was  the  observation  of  the  barotropic  component  of  the 
local  mean  flow  through  the  dispersion  relationship,  which  would 
otherwise  be  impossible  to  observe  due  to  the  lack  of  explicit 
current  measurements  (unless  some  other  assumptions  were  made,  such 
as  the  level  of  no  motion).  We  have  also  obtained  an  estimate  of 
the  oaroclinic  component  of  the  mean  flow,  corresponding  to  a 
westward  shear  flow  of  the  1st  baroclinic  mode.  Supporting  evidence 
for  the  presence  of  such  a  sheared  mean  flow  in  the  tomographic 
region  can  be  found  inCornuelle  et  al .  (1985):  they  have  computed 
the  difference  between  the  average  sound-speed  profiles  in  the 
tomographic  and  MODE  regions,  and  the  differenced  profile  strongly 
resembles  the  first  baroclinic  mode  (Fig.  3.6);  moreover,  it  is 
negative  and  negative  perturbation  implies  the  flow  direction  is 
westward.  In  Fig.  6.6,  we  show  the  profiles  of  the  mean  current 
obtained  in  the  optimal  wave  fit. 

One  of  our  goals  was  to  Investigate  whether  planetary  waves 
could  be  detected  by  acoustic  tomography  alone.  It  was,  perhaps,  a 
little  bit  disappointing  to  find  out  that  the  tomographic  system 
deployed  in  the  experiment  was  not  able  to  do  so  alone,  that  is  to 


determine  the  wave  parameters  uniquely.  But,  we  must  keep  in  mind 
that  this  was  only  the  first  field  test  of  such  observational 
system,  and  therefore  the  system  was  far  from  being  perfect.  It  can 
be  shown  in  computer  simulations  that  the  waves  would  have  been 
detected  if  the  noise  level  was  reduced  to  ~5  ms  or  the  mooring 
positions  were  accurately  navigated,  suggesting  that  the  tomographic 
system  is  potentially  capable  of  detecting  such  waves  by  itself. 

Obviously,  the  spot-measurement  system  deployed  was  also  unable 
to  detect  the  waves  alone.  The  reason  is  that  the  system  did  not 
obtain  any  information  on  the  wave  field  over  a  long  period  (~40 
days)  between  the  two  CTD  surveys,  except  at  three  horizontal  spots 
where  the  midwater  temperature  recorders  and  sensors  were  moored. 

As  to  the  spot-measurement  system,  the  inclusion  of  the  acoustic 
data  provided  the  missing  information  needed  to  make  the  detection 
successful.  In  view  of  the  pure  acoustic  objective  maps  in 
Cornuelle  (1983),  Cornuelle  et  al.  (1985)  and  the  result  of  our  pure 
acoustic  wave  fits,  we  may  describe  the  acoustic  data  as  containing 
the  information  of  the  westward  movement  of  a  cold  pattern.  This 
information  has  filled  the  gap  between  the  two  CTD  surveys  and  the 
moored  temperature  data  at  three  horizontal  spots  to  give  a  unique 
estimate  of  the  wave  parameters. 

In  retrospect,  the  major  obstacle  to  understanding  the 
large-scale  fluctuations  in  the  ocean  interior  has  been  the 
difficulty  in  observing  them.  Traditional  observational  systems  by 


themselves  are  not  adequate  for  large-scale  monitoring,  because  an 
excessive  amount  of  ship  time  and  too  many  instruments  would  be 
required  to  attain  the  proper  resolution  of  the  field.  The  newly 
invented  technique  of  acoustic  remote  sensing,  however,  holds  great 
promise  (Munk  and  Wunsch,  1979,  and  The  Ocean  Tomography  Group, 
I9b2).  A  full  tomographic  system  is  much  more  cost-effective  than  a 
full  spot-measurement  system  and  has  the  potential  to  provide 
adequate  mapping  by  itself,  as  has  been  demonstrated  by  Cornuelle  et 
al .  (1984).  In  this  study,  we  have  further  demonstrated  that  a 
tomographic  observational  system,  when  incorporated  with  sparse  spot 
measurements  and  the  plausible  dynamics  of  the  field,  is  certainly 
capable  of  making  observations  of  large-scale  phenomena. 


CHAPTER  7 


THE  ERROR  OF  THE  TOMOGRAPHIC  INVERSE  SOLUTION 
IN  THE  PRESENCE  OF  UNTRACKED  MOORING  MOTIONS 

7.1  Introduction 

In  this  chapter,  we  investigate  the  error  of  the  optimal 
solution  sc*  for  the  large-scale  sound- speed  perturbation  sc  in 
space  x=(x,y,z),  attained  via  a  pure  acoustical  inversion  based  on 

the  travel-time  data  at  measured  at  one  moment  in  time.  In 
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particular,  we  study  the  error  variance  <asc*  >=<(sc*-sc)  >  of 
sc*  in  the  presence  of  untracked  horizontal  random  motions  sx  of  the 
moored  acoustic  sources  and  receivers.  Since  we  do  not  consider 
time-correlated  mappings  of  sc,  the  time  dependence  of  sc ,  sx  and  at 
is  suppressed. 

Since  the  observed  travel- time  perturbations  at  contain 
information  on  oceanic  perturbations  integrated  along  the  ray  paths 
and  since  the  integration  automatically  filters  small-scale  oceanic 
perturbations,  at  are  prominent  candidates  for  the  data  to  be  used 
in  estimating  the  large-scale  sound-speed  perturbations  in 
mid-oceans.  However,  in  using  at,  the  fluctuating  horizontal 
motions  of  the  sources  and  receivers  sx  must  be  taken  into  special 
consideration  because  the  dominant  portion  of  at  is  produced  by  6x 
rather  than  <sc.  While  a  horizontal  mooring  displacement  of  200  m 
perturbs  the  travel  time  by  more  than  100  ms,  a  typical  mesoscale 


eddy  field  perturbs  the  travel  times  by  only  about  25  ms  in  a  300  km 
transmission.  The  large  travel-time  perturbations  produced  by  sx 
cannot  be  modelled  as  part  of  the  experimental  noise,  because  this 
will  only  cause  the  estimate  of  sc  to  have  an  unacceptablely  large 
error  variance.  The  vertical  component  of  mooring  motion  is  not 
considered  nere  because  it  is  usually  smaller  and  produces 
insignificant  travel  time  perturbati  ons  . 

In  order  to  estimate  sx  and  sc  accurately,  the  use  of  acoustical 
navigational  systems  for  tracking  mooring  positions  was  recommended 
by  Munk  and  Wunsch  (1979)  and  deployed  by  The  Ocean  Acoustic 
Tomography  Group  during  the  1981  Ocean  Acoustic  Tomography 
experiment.  The  idea  is  to  estimate  sc  based  on  the  corrected 
travel  time  data  in  which  the  large  noise  induced  by  the  mooring 
motion  is  removed.  However,  tracking  data  can  be  missing  because  of 
instrument  failure;  in  that  case,  the  best  estimate  of  sc  is  found 
by  treating  the  travel-time  perturbations  induced  by  sx  also  as 
signals  in  a  inversion  in  which  both  sc  and  sx  are  estimated, 
simul tanueously  (Corn  jell e,  1983  ,  and  also  see  Chapter  4  for  the 
discussion  on  desi gn-parameters  subject  to  errors).  In  this  way,  an 
optimal  estimate  sx*  of  sx  is  also  found;  sx*,  with  no  doubt,  is  a 
reliable  estimate  since  the  corresponding  signals  dominate  in  the 
data.  However,  the  objective  of  Ocean  Acoustic  Tomography  is  to  get 
a  reliable  estimate  of  sc  rather  than  sx,  and  we  can  expect  some 
trade-offs  between  the  quality  of  the  two  estimates,  for  large  sx 
can  upgrade  sx*  and  degrade  sc*  at  the  same  time. 


Although  the  simul  tanueous  estimation  of  _6x  and  6C  is  the  last 
resort  for  missing  tracking  data,  it  is  worthwhile  and  interesting 
in  considering  the  economic  aspects  of  Ocean  Acoustic  Tomography,  to 
ask  whether  rel  iable  mapping  of  6C  can  be  generated  without  the 
deployment  of  navigational  systems  for  tracking  mooring  motions  at 
all.  A  general  answer  to  the  above  question  cannot  be  given  because 
it  depends  upon  particulars:  the  amount  of  available  information 
concerning  <sc ,  such  as  the  statistics  of  its  horizontal  and  vertical 
structure  in  the  ocean  of  interest,  the  smallness  of  the 
experimental  noise  compared  to  the  oceanic  signal,  the  tomographic 
configuration  (geometrical  arrangement  of  acoustic  sources  and 
receivers  in  the  ocean),  and  the  variance  of  ^x  (which 
depends  on  the  type  of  moorings  used  and  the  forces  acting  on  them), 
all  contribute  to  the  answer.  Thus,  a  problem  in  engineering  design 
is  to  decide  whether  tracking  mooring  positions  is  necessary  or  not, 
prior  to  conducting  an  experiment  in  a  selected  ocean,  with  the 
available  statistics  of  ac  and  ax,  and  a  selected  tomographic 
configuration.  The  decision  can  be  made  only  by  computing  <a ac*  > 
in  numerical  simulations  and  seeing  if  the  error  is  tolerable. 

The  main  purpose  of  this  chapter  is  to  show  that  there  is  an 
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upper  bound  for  <aac*  >  as  a  function  of  and  this  upper 

error  variance  bound  is  rapidly  reached  with  slowly  increasing 

2 

ax,  implying  that  the  error  of  sc*  is  effectively  independent 
of  mooring  motion  once  the  latter  has  reached  a  critical  value. 

This  result  simplifies  the  decision  making  process  because  only  the 


upper  error  variance  bound  is  important  for  the  determination  of 
whether  tracking  mooring  motions  is  needed  or  not,  regardless  of  the 
size  of  ox.  In  the  next  section,  the  system  of  equations  with 
unknown  ac  and  ax  are  formulated,  and  the  system  is  then  used  for 
the  derivation  of  the  analytical  expressions  for  ac*  and  its  error 
variance  in  Section  (7.3).  Also  in  Section  (7.3),  the  upper  error 
variance  bound  is  shown  to  exist.  This  upper  error  variance  bound 
coincides  or  approximately  coincides  with  the  error  variance  of  a 
solution  for  ac  that  is  estimated  with  the  "differenced  system". 

The  differenced  system,  in  which  ax  is  eliminated,  consists  of  a  set 
of  "differenced  model  equations"  that  relates  ac  to  the  "differenced 
travel  time  perturbation  data".  In  the  elimination  of  ax,  one  of 
the  model  equations  associated  with  a  resolved  ray  path  for  each  of 
the  source- receiver  (S-R)  pairs  is  used  as  a  reference  and 
substracted  from  the  other  equations  associated  with  the  other 
resolved  ray  paths  for  the  same  S-R  pair.  The  differenced  system, 
its  solution  and  the  error  variance  of  its  solution  are  presented  in 
Section  (7.4).  In  a  computer  simulated  study  presented  in  Section 
(7.5),  we  demonstrate  that  the  upper  error  variance  bound  is  rapidly 
reached.  Conclusions  are  stated  in  Section  (7.6). 


7.2  The  System  With  Untracked  Mooring  Motions 


Suppose  there  are  NS  moored  sources  (SpSg, . . .  .S^)  and  NR 
moored  receivers  (R^,R2, . . . , R^R )  deployed  in  a  typical 
mid-ocean  tomographic  experiment,  and  there  are  N  resolved 
multi  paths  for  each  of  the  S-R  pairs,  so  that  at  an  instant  in  time 
there  are  a  total  of  m=qxN  observed  travel  time  perturbations  with 
q=NSxNR  and  a  total  of  u=2(NS+NR)  unknown  horizontal  mooring 
displacement  components.  Let  6t^  be  the  travel  time  perturbation 
ooserved  from  the  ith  ray  path  in  the  set  of  N  resolved  multi  paths 
that  connects  the  1th  S-R  pair;  this  ray  path  has  a  nominal 
trajectory  given  by  x ( s 7- ^ )  with  s^  being  the  arc  length  along 
the  patn's  trajectory.  Let  us  define  the  1th  S-R  pair  as  the 
Sj -R^  pair  with  l=NR(j-l)+k.  Also,  denote  the 
(eastward.northward)  horizontal  random  mooring  displacements  of  S. 
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and  Rk  as  (ax^ax^j)  and  Ux^ax^j),  respectively ,  with 
s=2j-l  and  r=2NS+2k-l.  It  then  follows  fromCornuelle  (1983)  that 
the  linearized  model  equation  corresponding  to  the  datum  at^  can 
be  expressed  as 
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where  a^  is  the  ray  parameter  (the  sound  slowness  at  the  turning 
point)  of  the  ray  x(s.,  ),  v.,  is  the  experimental  noise  in 


6t.j i ,  Cq(z)  is  the  mean  sound  speed  profile  that  varies  with 
depth  -z,  and  j6-|  is  the  direction  of  the  horizontal  line  of 
transmission  from  Sj  to  R^,  measured  in  degrees  (positive 
anticlockwise)  with  respect  to  the  east-axis  (x-axis). 

Following  Munk  and  Wunsch  (1979),  we  discretize  6c(x)  into  an 
n-dimensional  vector  ic  with  the  components  being  the  sound  speed 
perturbations  averaged  over  small  regions  (boxes)  of  equal  volume  in 
the  ocean,  so  that  the  term  involving  the  continuous  integration  in 
(7.1)  can  be  approximated  as  a  weighted  discrete  sum: 
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with  each  component  in  the  weighting  vector  w^  being  minus  the 
product  of  the  length  of  the  segment  of  s^  and  the  mean- square 
sound  slowness  in  the  corresponding  box.  After  joining  all  the 
6X.j  's  in  the  vector  «x  such  that 


5X  =  ( <sx^  ,6*2 » 


(7.3] 


(7.1)  can  be  approximated,  with  the  use  of  (7.2),  as 
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where  b is  the  weighting  vector  of  6x  that  has  only  four  nonzero 


components:  +  a^cos^  and  +  a^sino^  in  the  corresponding 
columns  as  described  by  (7.1). 

We  can  now  proceed  to  write  down  the  system  of  equations 
appropiate  for  the  tomographic  inversion.  After  segmenting  the 
complete  data  vector  at  into  partial  data  vectors  6t^'s  and  the 
complete  noise  vector  v  into  partial  noise  vectors  v^ 's  such  that 


at.  =  (6til,at.2 . 5tiq)T;  1=1. 2,..., N, 


(7.5) 


-i  =  (vil’vi2’****v1q)  ; 


(7.6) 


and  approximating  all  the  a^  's  by  a  referenced  mean  sound 
slowness  (this  approximtion  has  minimal  effects  on  the  model 
equations  because  all  the  resolved  ray  paths  are  near  axial  ray 
paths  with  small  launching  angles),  the  system  for  estimation  can  be 
expressed  as 


6t  =  F  £  +  V, 


(7.7a) 


ati  A,  B  r  fv, 

—  =  — 2  ’  £=?2^  »  E  =  77  *  *  a  *2 


An  8 


where  A^  is  an  qxn  matrix  with  w^  on  its  1th  row,  and  B  is 


an  qxu  matrix  with  b}.,  on  its  1th  row. 


7.3  The  Upper  Erorr  Variance  Bound 


We  summarize  the  a  priori  information  as  follows:  The  parameters 
<sc  and  <5X  to  be  estimated  have  zero  means  and  a  covariance  matrix 


-E  = 


C 

-<5C 


axiu 


(7.8 


where  C  and  a  I  are  the  covariance  matrices  of  «c  and 

—■ ’  6  C  X— U  _ 


6X ,  respectively,  and  ax  and  6£  are  uncorrelated;  £u  denotes  an 


identity  matrix  with  uxu  dimension.  For  simplicity,  all  the  ax^ 's 
are  assumed  to  be  uncorrelated  with  each  other  and  have  the  same 


variance  a  .  We  further  assume  uncorrelated  experimental  noise 


with  variance  a^  such  that  the  noise  covariance  matrix  is 


-v  =  avim* 


(7.9) 


We  are  new  in  the  position  to  apply  the  generalized  estimation 
procedure  derived  in  Chapter  4,  which  is  the  minimization  of  the 
objective  function  s(£)  of  (4.7),  to  the  present  situation.  Since 
the  model  equations  (7.7)  are  linear,  the  unique  minimum  of  s($_)  at 
£=£*  or  (a£,ax)=( «c*,6x*)  is  the  linear  minimum  variance  estimate, 
and  its  error  covariance  matrix  is  identical  to  the  inverse  Hessian 


j-1 


matrix  H  of  s(j>).  After  replacing  F,  and  £v  in  the 


equation  for  H  (4.22b)  with  their  present  definitions  as  given  in 
(7.7b),  (7.8)  and  (7.9),  we  obtain 


where  CAfiCllr  and  CA6X*  are  the  error  covariance  matrices  of  «c* 
ana  6X* ,  respectively,  and  CAI_*  A,v*  is  the  cross  covariance 
matrix  of  the  errors  of  «c*  and  «x*.  With  the  use  of  matrix 
identities  given  in  standard  mathematical  texts  for  the  inversions 
of  Dlock  matrices,  we  further  obtain 


-2 


-A6C* 


^  -  -1  <  z 


Cf  N 


i=l 


T  N  1 

A*  )G(  £  Aj)]  \ 

i=l 


(7J 


where 


G  = 


(Ba  )(°V  I  +  ^BTB)’1{BalT. 

~x  r_u  x~  -  ~ x 


(7.1 


Furthermore,  from  the  equation  for  £*  (4.22a)  with  jpgsO,  «c*  can 
be  equated  to 


7.4  The  Differenced  System 


The  differenced  system  can  be  expressed  as 


at9-atT 

ni-ni 

= 

—  — 
Ap-Ai 

?r?l 

ac  + 

r  vii 

-6 1  ^ 

,4r*i 

V-l 

in  which  ax  is  eliminated.  Notice  that  the  elimination  is  done  by 
subtracting  a  set  of  model  equations  (ati=A^ ac»Vi  )  from  the 
other  sets  ( at^A^ac+v.,- ).  The  corresponding  estimation  is 
therefore  based  on  the  differenced  data  (st^-at^),  the 
differenced  model  equations  (A  1 -A^ )  and  the  differenced  noises 
(v^-v^).  The  noise  of  the  new  system  (7.18)  is  correlated  and 
has  twice  the  variance  of  the  original  system  (7.7).  The  covariance 
matrix  of  the  differenced  noises  is 


Applying  (4.22a)  and  (4.22b)  to  the  differenced  system,  and 
equating,  the  error  covariance  matrix  of  the  estimated  ac  and 
the  estimate  ac  *  itself  become 


Interestingly,  if  the  product  q  of  the  number  of  sources  and 
receivers  coincides  with  the  rank  of  B  when  q<u,  that  is  when  B  is 
under  determined,  then  we  have  G.,=  I  and  hence  U=U,  and 

U  A 

6c*=<5Ca*.  It  is  always  true  that  U  <  UA,  and  in  fact,  if  a  lot 
of  moorings  are  deployed  so  that  q>>u  and  hence  the  diagonal 
elements  of  Gy  are  significantly  smaller  that  unity,  then 
U«U,.  However,  in  realistic  experiments,  q~u,  implying  that 

u-u. . 

—  —A 

It  is  found  that  the  error  variance  of  «c*  for  given  noise 
level,  a  priori  information  and  geometry  of  the  acoustic  array  is 
oounded  approximately  between  L  and  UA,  as  given  in  (7.11)  and 
(7.20),  respectively,  and  the  error  variance  approaches  the  upper 
bound  UA  as  ox  increases.  If  in  practice  ox  always  exceeded  a 
critical  value  such  that  UA  is  always  reached,  then  UA  can  be 
used  as  a  guideline  in  the  determination  of  whether  the  tracking  of 
mooring  motion  is  needed  for  a  given  experimental  setup.  The 
crucial  question,  therefore,  is  to  find  how  small  that  critical 
value  of  o  is  or  how  large  can  a  be  before  the  upper  bound  is 
reached.  We  will  pursue  the  answer  through  a  computer- simulation 
study  next. 


.  *\  \  O 


7.b  Numerical  Results 


p 

Computer  simulations  are  used  here  to  study  how  large 
can  be  before  C  reaches  U~UA  for  a  typical  situation.  The 

—AoC~  —  —A 

tomographic  configuration  and  58  ray  paths  of  the  1981  experiment 
are  used  in  this  simulated  study;  there  are  q=NSxNR=4x5=20  S-R 
pairs,  u=2(  NS+NR)=2{  4+5)  =18  unknown  ax.j 's  and  about  three  ray 
paths  used  per  S-R  pair;  the  rank  of  B  is  18-3=15.  The  vertical 
structure  of  the  simulated  ac  consists  of  only  the  first  baroclinic 
perturbation.  Horizontally,  the  simulated  ac  is  homogenous  and 
isotropic,  and  has  a  Gaussian  correlation  function  with  a  decay 

scale  of  100  km  and  an  rms  value  of  2  m/s  at  a  depth  of  700  m.  The 

2  2  2 
noise  variance  a y  is  set  to  5  ms  . 

The  covariance  matrices  C  *  for  a  =0,  100  m  and  200  m,  and 
tne  upper  error  variance  bound  are  calculated  numerically.  The 
standard  deviation  of  ac*  (i  .e.  square  root  of  the  diagonal  elements 
of  C  *)  for  o^=0  and  200  m,  and  the  upper  bound  for  the 
standard  deviation  (i.e.  square  root  of  the  diagonal  elements  of 
_U A )  at  a  depth  of  700  m  are  contoured  in  Figures  (7.1),  (7.2)  and 
(7.3),  respectively.  The  rms  errors  of  ac*  versus  ax  at  two 
representative  locations  (a)  and  (b)  in  space  are  also  plotted  in 
Figures  (7.4a)  and  (7.4b),  respectively.  While  (a)  is  located  in  an 
area  with  a  low  density  of  ray  paths  at  the  lower  right  corner  of 
tiie  experimental  region,  (b)  is  located  in  an  area  with  a  high 
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Flgure  7.1.  Standard  deviation,  at  a  depth  of  700  m  In  the 
tomographic  square,  of  the  linear,  tomographic  sound-speed 
perturbation  estimate  In  the  absence  of  untracked  mooring  motion. 
Tne  sound-speed  perturbation  has  an  rms  value  of  2  n/s  and  a 
horizontal  correlation  length  of  100  km.  The  experimental  noise  Is 
5  ms  (rms).  Contour  Interval  Is  0.1  n/s. 


V 


.>v\  ■ 


Q  II  II  II 

4C*  =  (_^p*)[  Y  A^-  h  Y  ^)G(  Y  — i  (7-14) 


Since  «x*  dnd  £A6X*  are  not  our  primary  concerns  here,  their 
mathematical  expressions  are  not  presented. 

It  is  seen  from  (7.11),  (7.12)  and  (7.13)  that  for  a  given 
amount  of  a  priori  information  (£6(J,  a  given  noise  level  (av), 
and  a  given  tomographic  configuration  (which  determines  A-j's),  L 
is  tne  smallest  error  covariance  matrix  of  sc*  that  can  be  attained 
using  Known  mooring  motions.  If  the  mooring  motions  are  known  so 
that  o=0  ana  hence  G=0,  then  Cif/.*=L,  and  CAI„*  increases  as 
o x  increases.  However,  in  the  limit  when  ^x  is  large  enough  so 

O 

that  the  ratios  of  o^  to  the  variances  of  the  signals  produced 

2  T 

by  6X  (the  diagonal  elements  of  a  B  B)  approach  zero,  C  . 
approaches  its  maximum  bound  U  and  it  becomes  invariant  with  ox 
because  G  approaches 


£y  =  B  B  , 


(7.15) 


where  B  is  the  pseudoinverse  of  B,  and  G  is  no  longer  a  function 
of  o  .  This  upper  error  variance  bound  U  of  sc*  can  be  expressed 


U  =  [L"1-N-1a;2( 


y  v]_i- 


(7.16) 


Figure  7.2.  Standard  deviation,  at  a  depth  of  700  m  in  the 
tomographic  square,  of  the  linear,  tomographic  sound-speed 
perturbation  estimate  in  the  presence  of  200  m  (rms)  un tracked 
mooring  motion.  The  sound-speed  perturbation  has  an  rms  value  of  2 
m/s  and  a  horizontal  correlation  length  of  100  km.  The  experimental 
noise  is  5  ms  (rms).  Contour  interval  is  0.1  nv's. 
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Figure  7.3.  Standard  deviation,  at  a  depth  of  700  m  In  the 
tomographic  square,  of  a  linear,  tomographic  sound-speed 
perturbation  estimate  obtained  from  the  differenced  system.  These 
errors  are  approximately  the  upper- bound  errors.  The  sound-speed 
perturbation  has  an  rms  value  of  2  n^s  and  a  horizontal  correlation 
length  of  100  km.  The  experimental  noise  is  5  ms  (rms).  Contour 
interval  is  0.1  nVs. 


Figure  7.4a  and  b.  The  dependence  of  the  standard  deviation  of  the 
linear,  tomographic  sound-speed  perturbation  estimate  at  two 
locations  on  rms  mooring  displacement  in  the  absence  of  mooring 
tracking.  The  figures  show  that  the  upper  standard -deviation  bound 
(  )  is  rapidly  reached.  The  upper  bound  shown  is  approximated 

from  the  differenced  system.  The  soind-speed  perturbation  has  an 
rms  value  of  2  m^s  and  a  horizontal  correlation  length  of  100  km. 
The  experimental  noise  is  5  ms  (rms). 


density  of  ray  paths  near  the  center  of  the  region.  The  upper 
bounds  for  the  standard  deviation  of  sc*  at  the  two  locations  are 
also  plotted  in  the  corresponding  figures. 

It  is  seen  from  the  figures  that  the  error  converges  very 
rapidly  to  the  upper  bound;  the  standard  deviation  of  sc*  for  a 
small  ax  of  only  200  m  is  nearly  equal  to  the  maximum  standard 
deviation.  For  this  particular  experiment,  it  is  indicated  from 
Figures  (7.1)  and  (7.3)  that  in  order  to  estimate  sc  accurately,  say 
to  within  *0.5  m7s,  tracking  mooring  motions  is  required.  Notice 
that  the  regions  with  more  ray  paths  passing  through  them  have 
smaller  errors  only  when  ax=0,  that  is  only  when  the  oceanic 
signals  are  dominant  in  the  data.  This  is  because  as  far  as  the 
estimation  of  ac  is  concerned,  noise  becomes  dominant  in  the  data 
when  mooring  motions  are  not  tracked,  and  since  the  regions  with 
higher  density  of  ray  paths  resolve  more  data  variance,  they  also 
resolve  more  noise  variance  in  this  case. 


APPENDIX 


A  DERIVATION  OF  THE  SEQUENTIAL -REESTIMATION  ALGORITHM 

Let  us  choose  the  free  parameters  of  the  system  (6.11)  to  be  the 
spectral  amplitudes  a(t^)  at  the  final  time  t^  in  the  sequence 
of  observations  and  define  the  functions  s^  of  a(t^)  by 

1 

s(1^La(t1)]=  ^  sk[a(tk)],  (A. la) 

k=0 


where 

sQ[a(t0)j  =  1/2  [«t?t0)-A  a{t0)]Tc;K0)  CitPtQ)-A  a(tQ)] 

+  1/2  a(t0)TC;H0)  a(t0),  ( A.lb) 

sk[a(tk)]  =  1/2  C«t?tk)-A  a(tk)]TCyKk)  E«J:?tk)-A  a(tk)] 

for  k>0,  (A.lc) 

and  a(tk)  with  k<l  is  linearly  related  to  a(t^)  according  to  the 
linear  dynamical  relation  (6.13).  In  (A.lb),  £a(tg)  represents 
the  a  priori  covariance  of  a(t)  at  tQ  or  any  other  time.  With  the 
noise  being  uncorrelated  at  different  times,  it  follows  from  the 
oojective-f unction  approach  that  the  minimum- variance, 
maximum- 1  ik el  i hood  estimate  a*(tN)  of  a(t^)  can  be  evaluated  by 
minimizing 


(A.  2) 


staU^)]  =  slN,[a{tN)]. 

Through  a  Taylor- series  expansion,  we  can  recast  the  quadratic 
function  (A.l)  as 


s^taU^j  =  s(1  )[a*{t1  )J  +  1/2  [a(t])-a*(tj  )]THj[a(tj)-a*(tj )] 

( A.  3 ) 

where  a*(t-i)  is  the  minimum  point  and  is  the  Hessian  matrix 
of  s*1*.  Furthermore,  through  the  use  of  (6.13),  (A. la)  and 
(A. 3),  s<1+1>  can  be  expressed  as 

s(1+1)[a(t1+1)j  =  ^Ca(t1+1)-ap(t1+1)]TC"1(t1+1)  [a(t]+1)-ap(t1+1)] 

+  (A.  4a) 

where 

-P(tl+1 )  3  H|i*(ti )  (A. 4b) 

and 

ytiH,5M15r  (A-4c) 

We  have  neglected  the  constant  term  s^[a*(tj)]  in  writing  down 
(A. 4a);  this  is  of  no  consequence  in  the  subsequent  minimization  of 


s  .  After  setting  the  gradient  of  s  to  zero,  the 
unique  minimum  of  +  ^  is  found  to  be  at 

—  ^1+1^  — v  +1  ^ * 

( A.  5  a ) 

where 

J?l  +i  =  J^a^i  +i^ "^a^i 

( A. 5b) 

It  is  now  clear  that  the  optimal  estimate  aMt^)  can  be  obtained 
by  computing  the  aMt^l's,  that  is  sequentially  minimizing  the 
functions  s^  in  oraer  of  increasing  1.  Each  minimization  in  the 
sequence  can  be  interpreted  as  an  improved  reestimation  of  the 
field.  The  covariance  of  the  field  is  updated  at  ear^  time  step  of 
the  reestimation  process  by  the  information  gained  from  the 
preceeaing  minimizations.  At  the  (1 +l)th  time  step,  using  (A.  4b), 

( A. 4c  )  and  {A. 5),  a  prediction  ap(t1+1)  of  a(t1+1)  is  first 
extrapolated  from  aMtj  )  vrfiich,  on  the  other  hand,  is  an  estimate 
of  alt-j  )  based  on  the  data  obtained  prior  to  ;  then  the 
predicted  value  is  corrected  in  an  estimation  that  uses  the  updated 
covariance  ^(tj+1)  and  the  data  obtained  at  -1+|- 
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